PREDICTING BREAST CANCER TREATMENT OUTCOME

Related Applications 

This application claims benefit of priority from U.S. Provisional Patent Application 
60/504,087, filed September 19, 2003, and is a continuation in part of U.S. Patent Application 
10/727,100, filed December 2, 2003. Both applications are hereby incorporated by reference in
their entireties as if fully set forth. 

Field of the Invention 

The invention relates to the identification and use of gene expression profiles, or patterns,
with clinical relevance to the treatment of breast cancer using tamoxifen (Nolvadex) and other
"antiestrogen" agents against breast cancer, including other "selective estrogen receptor
modulators" ("SERM"s), "selective estrogen receptor downregulators" ("SERD"s), and aromatase
inhibitors ("AI"s). In particular, the invention provides the identities of gene sequences the
expression of which is correlated with patient survival and breast cancer recurrence in women
treated with tamoxifen or other "antiestrogen" agents against breast cancer. The gene expression
profiles, whether embodied in nucleic acid expression, protein expression, or other expression
formats, may be used to select subjects afflicted with breast cancer who will likely respond
positively to treatment with tamoxifen or another "antiestrogen" agent against breast cancer as well
as those who will likely be non-responsive and thus candidates for other treatments. The invention
also provides the identities of three sets of sequences from three genes with expression patterns that
are strongly predictive of responsiveness to tamoxifen and other "antiestrogen" agents against
breast cancer.

Background of the Invention 

Breast cancer is by far the most common cancer among women. Each year, more than 
180,000 and 1 million women in the U.S. and worldwide, respectively, are diagnosed with breast 
cancer. Breast cancer is the leading cause of death for women between ages 50-55, and is the most
common non-preventable malignancy in women in the Western Hemisphere. An estimated
2,167,000 women in the United States are currently living with the disease (National Cancer 
Institute, Surveillance Epidemiology and End Results (NCI SEER) program, Cancer Statistics 
Review (CSR), www-seer.ims.nci.nih.gov/Publications/CSR1973 (1998)). Based on cancer rates 
from 1995 through 1997, a report from the National Cancer Institute (NCI) estimates that about 1 in
8 women in the United States (approximately 12.8 percent) will develop breast cancer during her 
lifetime (NCI's Surveillance, Epidemiology, and End Results Program (SEER) publication SEER 
Cancer Statistics Review 1973-1997). Breast cancer is the second most common form of cancer, 
after skin cancer, among women in the United States. An estimated 250,100 new cases of breast
cancer are expected to be diagnosed in the United States in 2001. Of these, 192,200 new cases of
more advanced (invasive) breast cancer are expected to occur among women (an increase of 5% 
over last year), 46,400 new cases of early stage (in situ) breast cancer are expected to occur among
women (up 9% from last year), and about 1,500 new cases of breast cancer are expected to be 
diagnosed in men (Cancer Facts & Figures 2001 American Cancer Society). An estimated 40,600
deaths (40,300 women, 400 men) from breast cancer are expected in 2001. Breast cancer ranks
second only to lung cancer among causes of cancer deaths in women. Nearly 86% of women who 
are diagnosed with breast cancer are likely to still be alive five years later, though 24% of them will 
die of breast cancer after 10 years, and nearly half (47%) will die of breast cancer after 20 years. 

Every woman is at risk for breast cancer. Over 70 percent of breast cancers occur in women
who have no identifiable risk factors other than age (U.S. General Accounting Office. Breast
Cancer, 1971-1991: Prevention, Treatment and Research. GAO/PEMD-92-12; 1991). Only 5 to
10% of breast cancers are linked to a family history of breast cancer (Henderson IC, Breast Cancer. 
In: Murphy GP, Lawrence WL, Lenhard RE (eds). Clinical Oncology. Atlanta, GA: American 
Cancer Society; 1995:198-219). 

Each breast has 15 to 20 sections called lobes. Within each lobe are many smaller lobules.
Lobules end in dozens of tiny bulbs that can produce milk. The lobes, lobules, and bulbs are all
linked by thin tubes called ducts. These ducts lead to the nipple in the center of a dark area of skin 
called the areola. Fat surrounds the lobules and ducts. There are no muscles in the breast, but 
muscles lie under each breast and cover the ribs. Each breast also contains blood vessels and lymph 
vessels. The lymph vessels carry colorless fluid called lymph, and lead to the lymph nodes.
Clusters of lymph nodes are found near the breast in the axilla (under the arm), above the 
collarbone, and in the chest. 

Breast tumors can be either benign or malignant. Benign tumors are not cancerous; they do
not spread to other parts of the body, and are not a threat to life. They can usually be removed, and
in most cases, do not come back. Malignant tumors are cancerous, and can invade and damage 
nearby tissues and organs. Malignant tumor cells may metastasize, entering the bloodstream or 
lymphatic system. When breast cancer cells metastasize outside the breast, they are often found in 
the lymph nodes under the arm (axillary lymph nodes). If the cancer has reached these nodes, it 
means that cancer cells may have spread to other lymph nodes or other organs, such as bones, liver,
or lungs. 

Major and intensive research has been focused on early detection, treatment and prevention. 
This has included an emphasis on determining the presence of precancerous or cancerous ductal 
epithelial cells. These cells are analyzed, for example, for cell morphology, for protein markers, for
nucleic acid markers, for chromosomal abnormalities, for biochemical markers, and for other
characteristic changes that would signal the presence of cancerous or precancerous cells. This has
led to reports of various molecular alterations in breast cancer, few of which have
been well characterized in human clinical breast specimens. Molecular alterations include
presence/absence of estrogen and progesterone steroid receptors, HER-2 expression/amplification
(Mark HF, et al. HER-2/neu gene amplification in stages I-IV breast cancer detected by fluorescent
in situ hybridization. Genet Med; 1(3):98-103 1999), Ki-67 (an antigen that is present in all stages
of the cell cycle except G0 and used as a marker for tumor cell proliferation), and prognostic
markers (including oncogenes, tumor suppressor genes, and angiogenesis markers) like p53, p27,
Cathepsin D, pS2, multi-drug resistance (MDR) gene, and CD31.

Tamoxifen is the antiestrogen agent most frequently prescribed in women with both early
stage and metastatic hormone receptor-positive breast cancer (for reviews, see Clarke, R. et al.
"Antiestrogen resistance in breast cancer and the role of estrogen receptor signaling." Oncogene 22, 
7316-39 (2003) and Jordan, C. "Historical perspective on hormonal therapy of advanced breast 
Cancer." Clin. Ther. 24 Suppl A, A3-16 (2002)). In the adjuvant setting, tamoxifen therapy results
in a 40-50% reduction in the annual risk of recurrence, leading to a 5.6% improvement in 10 year
survival in lymph node negative patients, and a corresponding 10.9% improvement in node-positive 
patients (Group, E.B.C.T.C. Tamoxifen for early breast cancer. Cochrane Database Syst Rev, 
CD000486 (2001)). Tamoxifen is thought to act primarily as a competitive inhibitor of estrogen 
binding to estrogen receptor (ER). The absolute levels of ER expression, as well as that of the
progesterone receptor (PR, an indicator of a functional ER pathway), are currently the best 
predictors of tamoxifen response in the clinical setting (Group, (2001) and Bardou, V.J. et al. 
"Progesterone receptor status significantly improves outcome prediction over estrogen receptor 
status alone for adjuvant endocrine therapy in two large breast cancer databases." J Clin Oncol 21, 
1973-9 (2003)).

However, 25% of ER+/PR+ tumors, 66% of ER+/PR- cases and 55% of ER-/PR+ cases fail 
to respond, or develop early resistance to tamoxifen, through mechanisms that remain largely 
unclear (see Clarke et al.; Nicholson, R.I. et al. "The biology of antihormone failure in breast 
cancer." Breast Cancer Res Treat 80 Suppl 1, S29-34; discussion S35 (2003) and Osborne, C.K. et
al. "Growth factor receptor cross-talk with estrogen receptor as a mechanism for tamoxifen
resistance in breast cancer." Breast 12, 362-7 (2003)). Currently, no reliable means exist to allow
the identification of these non-responders. In these patients, the use of alternative hormonal 
therapies, such as the aromatase inhibitors letrozole and anastrozole (Ellis, M.J. et al. "Letrozole is 
more effective neoadjuvant endocrine therapy than tamoxifen for ErbB-1- and/or ErbB-2-positive,
estrogen receptor-positive primary breast cancer: evidence from a phase III randomized trial." J Clin
Oncol 19, 3808-16 (2001); Buzdar, A.U. "Anastrozole: a new addition to the armamentarium 
against advanced breast cancer." Am J Clin Oncol 21, 161-6 (1998); and Goss, P.E. et al. "A 
randomized trial of letrozole in postmenopausal women after five years of tamoxifen therapy for 
early-stage breast cancer." N Engl J Med 349, 1793-802 (2003)), chemotherapeutic agents, or
inhibitors of other signaling pathways, such as trastuzumab and gefitinib, might offer the possibility
of improving clinical outcome. Therefore, the ability to accurately predict tamoxifen treatment
outcome should significantly advance the management of early stage breast cancer by identifying
patients who are unlikely to benefit from tamoxifen (TAM) so that additional or alternative therapies may be
sought. 

Citation of documents herein is not intended as an admission that any is pertinent prior art.
All statements as to the date or representation as to the contents of documents are based on the
information available to the applicant and do not constitute any admission as to the correctness of
the dates or contents of the documents.

Summary of the Invention

The present invention relates to the identification and use of gene expression patterns (or 
profiles or "signatures") and the expression levels of individual gene sequences which are clinically 
relevant to breast cancer. In particular, the identities of genes that are correlated with patient 
survival and breast cancer recurrence (e.g. metastasis of the breast cancer) are provided. The gene
expression profiles, whether embodied in nucleic acid expression, protein expression, or other
expression formats, may be used to predict survival of subjects afflicted with breast cancer and the
likelihood of breast cancer recurrence, including cancer metastasis. 

The invention thus provides for the identification and use of gene expression patterns (or 
profiles or "signatures") and the expression levels of individual gene sequences which correlate
with (and thus are able to discriminate between) patients with good or poor survival outcomes. In
one embodiment, the invention provides patterns that are able to distinguish patients with estrogen
receptor (α isoform) positive (ER+) breast tumors into those that are responsive, or likely to be
responsive, to treatment with tamoxifen (TAM) or another "antiestrogen" agent against breast
cancer (such as a "selective estrogen receptor modulator" ("SERM"), "selective estrogen receptor
downregulator" ("SERD"), or aromatase inhibitor ("AI")) and those that are non-responsive, or
likely to be non-responsive, to such treatment. In an alternative embodiment, the invention may be
applied to patients with breast tumors that do not display detectable levels of ER expression (so
called "ER-" subjects) but where the patient will nonetheless benefit from application of the
invention due to the presence of some low level ER expression. Responsiveness may be viewed in
terms of better survival outcomes over time. These patterns are thus able to distinguish patients
with ER+ breast tumors into at least two subtypes. 

In a first aspect, the present invention provides a non-subjective means for the identification 
of patients with breast cancer (ER+ or ER-) as likely to have a good or poor survival outcome 
following treatment with TAM or another "antiestrogen" agent against breast cancer by assaying for
the expression patterns disclosed herein. Thus where subjective interpretation may have been 
previously used to determine the prognosis and/or treatment of breast cancer patients, the present 
invention provides objective gene expression patterns, which may be used alone or in combination
with subjective criteria to provide a more accurate assessment of ER+ or ER- breast cancer patient
outcomes or expected outcomes, including survival and the recurrence of cancer, following 
treatment with TAM or another "antiestrogen" agent against breast cancer. The expression patterns 
of the invention thus provide a means to determine ER+ or ER- breast cancer prognosis. 
Furthermore, the expression patterns can also be used as a means to assay small, node negative 
tumors that are not readily assayed by other means.

The gene expression patterns comprise one or more than one gene capable of discriminating 
between breast cancer outcomes with significant accuracy. The gene sequence(s) are identified as 
correlated with ER+ breast cancer outcomes such that the levels of their expression are relevant to a 
determination of the preferred treatment protocols for a patient, whether ER+ or ER-. Thus in one 
embodiment, the invention provides a method to determine the outcome of a subject afflicted with
breast cancer by assaying a cell containing sample from said subject for expression of one or more 
than one gene disclosed herein as correlated with breast cancer outcomes following treatment with 
TAM or another "antiestrogen" agent against breast cancer. 

The ability to correlate gene expression with breast cancer outcome and responsiveness to 
TAM is particularly advantageous in light of the possibility that up to 40% of ER+ subjects that
undergo TAM treatment are non-responders. Therefore, the ability to identify, with confidence, 
these non-responders at an early time point permits the consideration and/or application of 
alternative therapies (such as a different "antiestrogen" agent against breast cancer or other
anti-breast cancer treatments) to the non-responders. Stated differently, the ability to identify TAM
non-responder subjects permits medical personnel to consider and/or utilize alternative therapies for the
treatment of the subjects before time is spent on ineffective TAM therapy. Time spent on an 
ineffective therapy often permits further cancer growth, and the likelihood of success with 
alternative therapies diminishes over time given such growth. Therefore, the invention also 
provides methods to improve the survival outcome of non-responders by use of the methods
disclosed herein to identify non-responders for treatment with alternative therapies. 

Gene expression patterns of the invention are identified as described below. Generally, a 
large sampling of the gene expression profile of a sample is obtained through quantifying the 
expression levels of mRNA corresponding to many genes. This profile is then analyzed to identify
genes, the expression of which is positively or negatively correlated with ER+ breast cancer
outcome upon treatment with TAM or another "antiestrogen" agent against breast cancer. An
expression profile of a subset of human genes may then be identified by the methods of the present
invention as correlated with a particular outcome. The use of multiple samples increases the
confidence with which a gene may be believed to be correlated with a particular survival outcome.
Without sufficient confidence, it remains unpredictable whether expression of a particular gene is
actually correlated with an outcome and also unpredictable whether expression of a particular gene
may be successfully used to identify the outcome for a breast cancer patient. While the invention
may be practiced based on the identities of the gene sequences disclosed herein or the actual
sequences used independent of identification, the invention may also be practiced with any other
sequences the expression of which is correlated with the expression of sequences disclosed herein. 
Such additional sequences may be identified by any means known in the art, including the methods 
disclosed herein. 
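
As a purely illustrative sketch of the kind of screening described above, the following Python ranks genes by how strongly their expression differs between responder and non-responder samples. Welch's t statistic is used here only as one possible measure of association; it is an assumption of this example and not the method stated in this disclosure. The gene names, sample names, and expression values are hypothetical toy data.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for two lists of expression values.

    Each group needs at least two values, and the groups must not both
    have zero variance (the standard error would then be zero).
    """
    se = math.sqrt(variance(group_a) / len(group_a)
                   + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


def rank_genes(expression, responders, nonresponders):
    """Return (gene, t) pairs sorted by |t|, strongest association first.

    A positive t means higher expression in responders; a negative t
    means lower expression in responders.
    """
    scores = []
    for gene, by_sample in expression.items():
        t = welch_t([by_sample[s] for s in responders],
                    [by_sample[s] for s in nonresponders])
        scores.append((gene, t))
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical toy data: expression level per gene, per sample.
expression = {
    "GENE_A": {"s1": 5.0, "s2": 6.0, "s3": 7.0, "s4": 1.0, "s5": 2.0, "s6": 3.0},
    "GENE_B": {"s1": 2.0, "s2": 3.0, "s3": 4.0, "s4": 2.5, "s5": 3.5, "s6": 3.0},
}
ranked = rank_genes(expression,
                    responders=["s1", "s2", "s3"],
                    nonresponders=["s4", "s5", "s6"])
```

With more samples per group the standard error shrinks, which mirrors the point above that multiple samples increase the confidence with which a gene may be considered correlated with an outcome.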

A profile of genes that are highly correlated with one outcome relative to another may be 
used to assay a sample from a subject afflicted with breast cancer to predict the likely
responsiveness (or lack thereof) to TAM or another "antiestrogen" agent against breast cancer in the
subject from whom the sample was obtained. Such an assay may be used as part of a method to 
determine the therapeutic treatment for said subject based upon the breast cancer outcome 
identified. 

As discussed below, the correlated genes may be used singly with significant accuracy or in
combination to increase the ability to accurately correlate a molecular expression phenotype with
a breast cancer outcome. This correlation is a way to molecularly provide for the determination of
survival outcomes as disclosed herein. Additional uses of the correlated gene(s) are in the
classification of cells and tissues; determination of diagnosis and/or prognosis; and determination
and/or alteration of therapy. 

The ability to discriminate is conferred by the identification of expression of the individual 
genes as relevant and not by the form of the assay used to determine the actual level of expression. 
An assay may utilize any identifying feature of an identified individual gene as disclosed herein as
long as the assay reflects, quantitatively or qualitatively, expression of the gene in the 
"transcriptome" (the transcribed fraction of genes in a genome) or the "proteome" (the translated 
fraction of expressed genes in a genome). Additional assays include those based on the detection of 
polypeptide fragments of the relevant member or members of the proteome. Identifying features
include, but are not limited to, unique nucleic acid sequences used to encode (DNA), or express
(RNA), said gene or epitopes specific to, or activities of, a protein encoded by said gene. All that is
required are the gene sequence(s) necessary to discriminate between breast cancer outcomes and an 
appropriate cell containing sample for use in an expression assay. 

In another embodiment, the invention provides for the identification of the gene expression
patterns by analyzing global, or near global, gene expression from single cells or homogenous cell
populations which have been dissected away from, or otherwise isolated or purified from,
contaminating cells beyond that possible by a simple biopsy. Because the expression of numerous
genes fluctuates between cells from different patients as well as between cells from the same patient
sample, multiple data from expression of individual genes and gene expression patterns are used as
reference data to generate models which in turn permit the identification of individual gene(s), the
expression of which is most highly correlated with particular breast cancer outcomes.

In additional embodiments, the invention provides physical and methodological means for 
detecting the expression of gene(s) identified by the models generated by individual expression 
patterns. These means may be directed to assaying one or more aspects of the DNA template(s)
underlying the expression of the gene(s), of the RNA used as an intermediate to express the gene(s),
or of the proteinaceous product expressed by the gene(s). 

In further embodiments, the gene(s) identified by a model as capable of discriminating 
between breast cancer outcomes may be used to identify the cellular state of an unknown sample of 
cell(s) from the breast. Preferably, the sample is isolated via non-invasive means. The expression 
of said gene(s) in said unknown sample may be determined and compared to the expression of said
gene(s) in reference data of gene expression patterns correlated with breast cancer outcomes. 
Optionally, the comparison to reference samples may be by comparison to the model(s) constructed 
based on the reference samples. 

One advantage provided by the present invention is that contaminating, non-breast cells
(such as infiltrating lymphocytes or other immune system cells) are not present to possibly affect
the genes identified or the subsequent analysis of gene expression to identify the survival outcomes 
of patients with breast cancer. Such contamination is present where a biopsy is used to generate 
gene expression profiles. However, and as noted herein, the invention includes the identity of genes
that may be used with significant accuracy even in the presence of contaminating cells.

In a second aspect, the invention provides a non-subjective means based on the expression 
of three genes, or combinations thereof, for the identification of patients with breast cancer as likely 
to have a good or poor survival outcome following treatment with TAM or another "antiestrogen" 
agent against breast cancer. These three genes are members of the expression patterns disclosed
herein which have been found to be strongly predictive of clinical outcome following TAM
treatment of ER+ breast cancer. 

The present invention thus provides gene sequences identified as differentially expressed in 
ER+ breast cancer in correlation to TAM responsiveness. The sequences of two of the genes 
display increased expression in ER+ breast cells that respond to TAM treatment (and thus lack of
increased expression in nonresponsive cases). The sequences of the third gene display decreased
expression in ER+ breast cells that respond to TAM treatment (and thus lack of decreased 
expression in nonresponsive cases). 

The first set of sequences found to be more highly expressed in TAM responsive, ER+ 
breast cells are those of interleukin 17 receptor B (IL17RB), which has been mapped to human
chromosome 3 at 3p21.1. IL17RB is also referred to as interleukin 17B receptor (IL17BR), and
sequences corresponding to it, which thus may be used in the practice of the instant invention, are
identified by UniGene Cluster Hs.5470.

The second set of sequences found to be more highly expressed in TAM responsive, ER+ 
breast cells are those of the calcium channel, voltage-dependent, L type, alpha 1D subunit
(CACNA1D), which has been mapped to human chromosome 3 at 3p14.3. Sequences
corresponding to CACNA1D, which thus may be used in the practice of the instant invention, are
identified by UniGene Cluster Hs.399966.

The set of sequences found to be expressed at lower levels in TAM responsive, ER+ breast 
cells are those of homeobox B13 (HOXB13), which has been mapped to human chromosome 17 at
17q21.2. Sequences corresponding to HOXB13, which thus may be used in the practice of the instant
invention, are identified by UniGene Cluster Hs.66731.

While the invention may be practiced based on the identities of these three gene sequences 
or the actual sequences used independent of the assigned identity, the invention may also be
practiced with any other sequence the expression of which is correlated with the expression of these
disclosed sequences. Such additional sequences may be identified by any means known in the art, 
including the methods disclosed herein. 

The identified sequences may thus be used in methods of determining the responsiveness, or 
non-responsiveness, of a subject's ER+ or ER- breast cancer to TAM treatment, or treatment with
another "antiestrogen" agent against breast cancer, via analysis of breast cells in a tissue or cell
containing sample from a subject. As non-limiting examples, the lack of increased expression of
IL17BR and CACNA1D sequences and/or the lack of decreased expression of HOXB13 sequences
may be used as an indicator of nonresponsive cases. The present invention provides a
non-empirical means for determining responsiveness to TAM or another SERM in ER+ or ER- patients.
This provides advantages over the use of a "wait and see" approach following treatment with TAM
or other "antiestrogen" agent against breast cancer. The expression levels of these sequences may 
also be used as a means to assay small, node negative tumors that are not readily assessed by 
conventional means. 

The expression levels of the identified sequences may be used alone or in combination with
other sequences capable of determining responsiveness to treatment with TAM or another
"antiestrogen" agent against breast cancer. Preferably, the sequences of the invention are used
alone or in combination with each other, such as in the format of a ratio of expression levels that 
can have improved predictive power over analysis based on expression of sequences corresponding 
to individual genes. The invention provides for ratios of the expression level of a sequence that is 
underexpressed to the expression level of a sequence that is overexpressed as an indicator of
responsiveness or non-responsiveness. 
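
A minimal sketch of such a ratio, under stated assumptions, is given below in Python. It takes the expression level of a sequence that is lower in TAM responders (HOXB13, per the text above) over the level of a sequence that is higher in responders (IL17BR) and reports the log2 ratio. The function names, the input values, and in particular the cutoff of 0.0 are hypothetical placeholders chosen for illustration; no threshold is taken from this disclosure.

```python
import math


def log2_expression_ratio(under_level, over_level):
    """Return log2(under_level / over_level) for two positive expression levels."""
    if under_level <= 0 or over_level <= 0:
        raise ValueError("expression levels must be positive")
    return math.log2(under_level / over_level)


def call_response(log2_ratio, cutoff=0.0):
    """Hypothetical decision rule: a ratio above the cutoff points toward
    non-responsiveness, since the numerator gene is the one expected to be
    lower in responders. The default cutoff is an illustrative assumption."""
    if log2_ratio > cutoff:
        return "likely non-responsive"
    return "likely responsive"


# Hypothetical normalized expression levels for one sample.
hoxb13_level = 8.0
il17br_level = 2.0
ratio = log2_expression_ratio(hoxb13_level, il17br_level)
label = call_response(ratio)
```

Working on the log scale keeps the ratio symmetric around zero, so a two-fold change in either direction has the same magnitude.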

The present invention provides means for correlating a molecular expression phenotype with 
a physiological response in a subject with ER+ or ER- breast cancer. This correlation provides a 
way to molecularly diagnose and/or determine treatment for a breast cancer afflicted subject.
Additional uses of the sequences are in the classification of cells and tissues; and determination of
diagnosis and/or prognosis. Use of the sequences to identify cells of a sample as responsive, or not, 
to treatment with TAM or other "antiestrogen" agent against breast cancer may be used to 
determine the choice, or alteration, of therapy used to treat such cells in the subject, as well as the
subject itself, from which the sample originated.

Such methods of the invention may be used to assist the determination of providing 
tamoxifen or another "antiestrogen" agent against breast cancer as a chemopreventive or 
chemoprotective agent to a subject at high risk for development of breast cancer. These methods of 
the invention are an advance over the studies of Fabian et al. (J Natl Cancer Inst. 92(15):1217-27,
2000), which proposed a combination of cytomorphology and the Gail risk model to identify high
risk patients. The methods may be used in combination with assessments of relative risk of breast
cancer such as that discussed by Tan-Chiu et al. (J Natl Cancer Inst. 95(4):302-307, 2003).
Non-limiting examples include assaying of minimally invasive sampling, such as random (periareolar)
fine needle aspirates or ductal lavage samples (such as that described by Fabian et al. and optionally
in combination with or as an addition to a mammogram positive for benign or malignant breast
cancer), of breast cells for the expression levels of gene sequences as disclosed herein to assist in
the determination of administering therapy with an "antiestrogen" agent against breast cancer, such
as that which may occur in cases of high risk subjects (like those described by Tan-Chiu et al.). The
assays would thus lead to the identification of subjects for whom the application of an "antiestrogen"
agent against breast cancer would likely be beneficial as a chemopreventive or chemoprotective
agent. It is contemplated that such application as enabled by the instant invention could lead to 
beneficial effects such as those seen with the administration of tamoxifen (see for example, 
Wickerham D.L., Breast Cancer Res. and Treatment 75 Suppl 1:S7-12, Discussion S33-5, 2000). 
Other applications of the invention include assaying of advanced breast cancer, including metastatic 
cancer, to determine the responsiveness, or non-responsiveness, thereof to treatment with an
"antiestrogen" agent against breast cancer. 

An assay of the invention may utilize a means related to the expression level of the 
sequences disclosed herein as long as the assay reflects, quantitatively or qualitatively, expression 
of the sequence. A quantitative assay means is, however, preferred. The ability to
determine responsiveness to TAM or other "antiestrogen" agent against breast cancer and thus 
outcome of treatment therewith is provided by the recognition of the relevancy of the level of 
expression of the identified sequences and not by the form of the assay used to determine the actual 
level of expression. Identifying features of the sequences include, but are not limited to, unique
nucleic acid sequences used to encode (DNA), or express (RNA), the disclosed sequences or
epitopes specific to, or activities of, proteins encoded by the sequences. Alternative means include
detection of nucleic acid amplification as indicative of increased expression levels and nucleic acid 
inactivation, deletion, or methylation, as indicative of decreased expression levels. Stated 
differently, the invention may be practiced by assaying one or more aspects of the DNA template(s)
underlying the expression of the disclosed sequence(s), of the RNA used as an intermediate to
express the sequence(s), or of the proteinaceous product expressed by the sequence(s), as well as 
proteolytic fragments of such products. As such, the detection of the presence of, amount of, 
stability of, or degradation (including rate) of, such DNA, RNA and proteinaceous molecules may 
be used in the practice of the invention. 

The practice of the present invention is unaffected by the presence of minor mismatches
between the disclosed sequences and those expressed by cells of a subject's sample. A non-limiting
example of the existence of such mismatches is seen in cases of sequence polymorphisms between
individuals of a species, such as individual human patients within Homo sapiens. Knowledge that
expression of the disclosed sequences (and sequences that vary due to minor mismatches) is
correlated with the presence of non-normal or abnormal breast cells and breast cancer is sufficient
for the practice of the invention with an appropriate cell containing sample via an assay for 
expression. 

In one embodiment, the invention provides for the identification of the expression levels of 
the disclosed sequences by analysis of their expression in a sample containing ER+ or ER- breast 
cells. In one preferred embodiment, the sample contains single cells or homogenous cell
populations which have been dissected away from, or otherwise isolated or purified from, 
contaminating cells beyond that possible by a simple biopsy. Alternatively, undissected cells within 
a "section" of tissue may be used. Multiple means for such analysis are available, including 
5 detection of expression within an assay for global, or near global, gene expression in a sample (e.g. 
as part of a gene expression profiling analysis such as on a microarray) or by specific detection, 
such as quantitative PCR (Q-PCR), or real time quantitative PCR. 

Preferably, the sample is isolated via non-invasive or minimally invasive means. The 
expression of the disclosed sequence(s) in the sample may be determined and compared to the
expression of said sequence(s) in reference data of non-normal or cancerous breast cells.
Alternatively, the expression level may be compared to expression levels in normal or
non-cancerous cells, preferably from the same sample or subject. In embodiments of the invention
utilizing Q-PCR, the expression level may be compared to expression levels of reference genes in 
the same sample or a ratio of expression levels may be used. 
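
As a non-limiting illustration of the Q-PCR comparison described above, the following sketch normalizes a target sequence to reference genes measured in the same sample and forms a ratio of two expression levels. The comparative threshold-cycle (Ct) arithmetic, the assumed doubling of product per cycle, and all function and parameter names are assumptions introduced for illustration only and are not taken from this disclosure.

```python
from statistics import mean


def relative_expression(ct_target, ct_references, efficiency=2.0):
    """Expression of a target relative to reference genes in the same sample.

    Assumes (hypothetically) that product multiplies by `efficiency` each
    cycle, so a lower Ct corresponds to a higher starting amount.
    """
    delta_ct = ct_target - mean(ct_references)
    return efficiency ** (-delta_ct)


def expression_ratio(ct_gene_a, ct_gene_b, efficiency=2.0):
    """Ratio of gene A expression to gene B expression in one sample."""
    return efficiency ** (ct_gene_b - ct_gene_a)


if __name__ == "__main__":
    # Hypothetical Ct values, for illustration only.
    print(relative_expression(24.0, [20.0, 22.0]))
    print(expression_ratio(22.0, 25.0))
```

Because both quantities are computed within a single sample, differences in input amount between samples cancel out of the ratio.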

When individual breast cells are isolated in the practice of the invention, one benefit is that
contaminating, non-breast cells (such as infiltrating lymphocytes or other immune system cells) are
not present to possibly affect detection of expression of the disclosed sequence(s). Such
contamination is present where a biopsy is used to generate gene expression profiles. However,
analysis of differential gene expression and correlation to ER+ breast cancer outcomes with both
isolated and non-isolated samples, as described herein, increases the confidence level of the
disclosed sequences as capable of having significant predictive power with either type of sample.

While the present invention is described mainly in the context of human breast cancer, it 
may be practiced in the context of breast cancer of any animal known to be potentially afflicted by 
breast cancer. Preferred animals for the application of the present invention are mammals,
particularly those important to agricultural applications (such as, but not limited to, cattle, sheep,
horses, and other "farm animals"), animal models of breast cancer, and animals for human 
companionship (such as, but not limited to, dogs and cats). 

The above aspects and embodiments of the invention may be applied equally with respect to 
use of more than one "antiestrogen" agent against breast cancer. In the case of a combination of
agents, any combination of more than one SERM, SERD, or AI may be used in place of TAM or
another "antiestrogen" agent against breast cancer. Aromatase is an enzyme that provides a major
source of estrogen in body tissues including the breast, liver, muscle and fat. Without being bound
by theory, and solely provided to assist in a better understanding of the invention, AIs are
understood to function in a manner comparable to TAM and other "antiestrogen" agents against
breast cancer, which are thought to act as antagonists of estrogen receptor in breast tissues and thus
to act against breast cancer. AIs may be either nonsteroidal or steroidal agents. Examples of the
former (which inhibit aromatase via the heme prosthetic group) include, but are not limited to,
anastrozole (arimidex), letrozole (femara), and vorozole (rivisor), which have been used or
contemplated as treatments for metastatic breast cancer. Examples of steroidal AIs, which
inactivate aromatase, include, but are not limited to, exemestane (aromasin), androstenedione, and
formestane (lentaron).

Other forms of therapy to reduce estrogen levels include surgical or chemical ovarian
ablation. The former is physical removal of the ovaries while the latter is the use of agents to block
ovarian production of estrogen. One non-limiting example of the latter is the use of agonists of
gonadotropin releasing hormone (GnRH), such as goserelin (zoladex). Of course the instant invention
may also be practiced with these therapies in place of treatment with one or more "antiestrogen"
agents against breast cancer.

The invention disclosed herein is based in part on the performance of a genome-wide 
microarray analysis of hormone receptor-positive invasive breast tumors from 60 patients treated
with adjuvant tamoxifen alone, leading to the identification of a two-gene expression ratio that is 
highly predictive of clinical outcome. This expression ratio, which is readily adapted to PCR-based 
analysis of standard paraffin-embedded clinical specimens, was validated in an independent set of 
20 patients as described below.

Brief Description of the Drawings

Figure 1 shows receiver operating characteristic (ROC) analyses of IL17BR, HOXB13, and 
CACNA1D expression levels as predictors of breast cancer outcomes in whole tissue sections (top 3 
graphs) and laser microdissected cells (bottom 3 graphs). AUC refers to area under the curve. 

Figure 2 contains six parts relating to the validation of a ratio of HOXB13 expression to
IL17BR expression as an indicator of responsiveness, or lack thereof, to TAM. Parts a and b show
the results of gene expression analysis of HOXB13 and IL17BR sequences by Q-PCR in both
Responder and Non-responder samples. Plots of the Responder and Non-responder training and
validation data sets are shown in Parts c and d, where "0" indicates Responder datapoints in both
and "1" indicates Non-responder datapoints in both. Parts e and f show plots of the Responder and
Non-responder training and validation data sets as a function of survival, where the upper line in 
each Part represents the Responders and the lower line represents the Non-responders. 

Modes of Practicing the Invention 

Definitions of terms as used herein: 

A gene expression "pattern" or "profile" or "signature" refers to the relative expression of
genes correlated with responsiveness to treatment of ER+ breast cancer with TAM or another
"antiestrogen" agent against breast cancer. Responsiveness or lack thereof may be expressed as 
survival outcomes which are correlated with an expression "pattern" or "profile" or "signature" that 
is able to distinguish between, and predict, said outcomes. 

A "selective estrogen receptor modulator" or SERM is an "antiestrogen" agent that in some
tissues acts like an estrogen (agonist) but blocks estrogen action in other tissues (antagonist).
"Selective estrogen receptor downregulators" (or "SERD"s), or "pure" antiestrogens, include agents
which block estrogen activity in all tissues. See Howell et al. (Best Practice & Res. Clin.
Endocrinol. Metab. 18(1):47-66, 2004). Preferred SERMs of the invention are those that are
antagonists of estrogen in breast tissues and cells, including those of breast cancer. Non-limiting
examples of such include TAM, raloxifene, GW5638, and ICI 182,780. The possible mechanisms
of action by various SERMs have been reviewed (see for example Jordan et al., 2003, Breast Cancer
Res. 5:281-283; Hall et al., 2001, J. Biol. Chem. 276(40):36869-36872; Dutertre et al. 2000, J.
Pharmacol. Exp. Therap. 295(2):431-437; and Wijayaratne et al., 1999, Endocrinology
140(12):5828-5840). Other non-limiting examples of SERMs in the context of the invention
include triphenylethylenes, such as tamoxifen, GW5638, TAT-59, clomiphene, toremifene,
droloxifene, and idoxifene; benzothiophenes, such as arzoxifene (LY353381 or LY353381-HCl);
benzopyrans, such as EM-800; naphthalenes, such as CP-336,156; and ERA-923.

Non-limiting examples of SERD or "pure" antiestrogens include agents such as ICI 182,780 
(fulvestrant or faslodex) or the oral analogue SRI 6243 and ZK 191703 as well as aromatase 
inhibitors and chemical ovarian ablation agents as described herein. 

Other agents encompassed by SERM as used herein include progesterone receptor inhibitors
and related drugs, such as progestomimetics like medroxyprogesterone acetate, megace, and
RU-486; and peptide based inhibitors of ER action, such as LH-RH analogs (leuprolide, zoladex,
[D-Trp6]LH-RH), somatostatin analogs, and LXXLL motif mimics of ER as well as tibolone and
resveratrol. As noted above, preferred SERMs of the invention are those that are antagonists of
estrogen in breast tissues and cells, including those of breast cancer. Non-limiting examples of
preferred SERMs include the actual or contemplated metabolites (in vivo) of any SERM, such as,
but not limited to, 4-hydroxytamoxifen (metabolite of tamoxifen), EM-652 (or SCH 57068 where
EM-800 is a prodrug of EM-652), and GW7604 (metabolite of GW5638). See Willson et al. (1997,
Endocrinology 138(9):3901-3911) and Dauvois et al. (1992, Proc. Nat'l. Acad. Sci., USA
89:4037-4041) for discussions of some specific SERMs.

Other preferred SERMs are those that produce the same relevant gene expression profile as
tamoxifen or 4-hydroxytamoxifen. One example of means to identify such SERMs is provided by
Levenson et al. (2002, Cancer Res. 62:4419-4426). 

A "gene" is a polynucleotide that encodes a discrete product, whether RNA or proteinaceous 
in nature. It is appreciated that more than one polynucleotide may be capable of encoding a discrete
product. The term includes alleles and polymorphisms of a gene that encodes the same product, or
a functionally associated (including gain, loss, or modulation of function) analog thereof, based 
upon chromosomal location and ability to recombine during normal mitosis. 

A "sequence" or "gene sequence" as used herein is a nucleic acid molecule or 
polynucleotide composed of a discrete order of nucleotide bases. The term includes the ordering of
bases that encodes a discrete product (i.e. "coding region"), whether RNA or proteinaceous in
nature, as well as the ordered bases that precede or follow a "coding region". Non-limiting
examples of the latter include 5' and 3' untranslated regions of a gene. It is appreciated that more
than one polynucleotide may be capable of encoding a discrete product. It is also appreciated that
alleles and polymorphisms of the disclosed sequences may exist and may be used in the practice of
the invention to identify the expression level(s) of the disclosed sequences or the allele or 
polymorphism. Identification of an allele or polymorphism depends in part upon chromosomal 
location and ability to recombine during mitosis. 

The terms "correlate" or "correlation" or equivalents thereof refer to an association between
expression of one or more genes and a physiological response of a breast cancer cell and/or a breast
cancer patient in comparison to the lack of the response. A gene may be expressed at higher or
lower levels and still be correlated with responsiveness, non-responsiveness or breast cancer
survival or outcome. The invention provides for the correlation between increases in expression of
IL17BR and CACNA1D sequences and responsiveness of ER+ breast cells to TAM or another
"antiestrogen" agent against breast cancer. Thus increases are indicative of responsiveness.
Conversely, the lack of increases, including unchanged expression levels, is an indicator of
non-responsiveness. Similarly, the invention provides for the correlation between decreases in
expression of HOXB13 sequences and responsiveness of ER+ breast cells to TAM or another
SERM. Thus decreases are indicative of responsiveness while the lack of decreases, including
unchanged expression levels, is an indicator of non-responsiveness. Increases and decreases may be
readily expressed in the form of a ratio between expression in a non-normal cell and a normal cell
such that a ratio of one (1) indicates no difference while ratios of two (2) and one-half indicate
twice as much, and half as much, expression in the non-normal cell versus the normal cell,
respectively. Expression levels can be readily determined by quantitative methods as described
below.
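
The ratio convention and the direction of change associated with responsiveness for each gene, as set out in the preceding paragraph, may be sketched as follows. This is a minimal illustration only: the function names and the lookup table are hypothetical, and expression values are assumed to be positive quantities on a linear scale.

```python
# Direction of change that the text associates with responsiveness.
RESPONSIVE_DIRECTION = {
    "IL17BR": "increase",
    "CACNA1D": "increase",
    "HOXB13": "decrease",
}


def expression_ratio(non_normal, normal):
    """Ratio of expression in a non-normal cell to that in a normal cell."""
    if normal <= 0:
        raise ValueError("normal-cell expression must be positive")
    return non_normal / normal


def direction(ratio):
    """Classify a ratio: 1 means no difference, 2 twice, 0.5 half as much."""
    if ratio > 1:
        return "increase"
    if ratio < 1:
        return "decrease"
    return "unchanged"


def indicates_responsiveness(gene, ratio):
    """True when the observed change matches the responsive direction.

    An unchanged level counts as a lack of increase or decrease and is
    therefore treated as an indicator of non-responsiveness.
    """
    return direction(ratio) == RESPONSIVE_DIRECTION[gene]
```

For instance, `indicates_responsiveness("HOXB13", expression_ratio(5.0, 10.0))` evaluates a ratio of one-half for HOXB13, which is a decrease and so matches the responsive direction.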

For example, increases in IL17BR, CACNA1D, or HOXB13 expression can be indicated by 
ratios of or about 1.1, of or about 1.2, of or about 1.3, of or about 1.4, of or about 1.5, of or about 
1.6, of or about 1.7, of or about 1.8, of or about 1.9, of or about 2, of or about 2.5, of or about 3, of 
or about 3.5, of or about 4, of or about 4.5, of or about 5, of or about 5.5, of or about 6, of or about
6.5, of or about 7, of or about 7.5, of or about 8, of or about 8.5, of or about 9, of or about 9.5, of or
about 10, of or about 15, of or about 20, of or about 30, of or about 40, of or about 50, of or about
60, of or about 70, of or about 80, of or about 90, of or about 100, of or about 150, of or about 200,
of or about 300, of or about 400, of or about 500, of or about 600, of or about 700, of or about 800,
of or about 900, or of or about 1000. A ratio of 2 is a 100% (or a two-fold) increase in expression.
Decreases in IL17BR, CACNA1D, or HOXB13 expression can be indicated by ratios of or about
0.9, of or about 0.8, of or about 0.7, of or about 0.6, of or about 0.5, of or about 0.4, of or about 0.3,
of or about 0.2, of or about 0.1, of or about 0.05, of or about 0.01, of or about 0.005, of or about
0.001, of or about 0.0005, of or about 0.0001, of or about 0.00005, of or about 0.00001, of or about
0.000005, or of or about 0.000001.

For a given phenotype, a ratio of the expression of a gene sequence expressed at increased 
levels in correlation with the phenotype to the expression of a gene sequence expressed at decreased 
levels in correlation with the phenotype may also be used as an indicator of the phenotype. As a 
non-limiting example, the phenotype of non-responsiveness to tamoxifen treatment of breast cancer
is correlated with increased expression of HOXB13 as well as decreased expression of IL17BR and
CACNA1D. Therefore, a ratio of the expression levels of HOXB13 to IL17BR (or CACNA1D) 
may be used as an indicator of non-responsiveness. 
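
A two-gene ratio of this kind may be computed as in the following sketch. The log2 transformation and the caller-supplied cutoff are assumptions made for illustration; no particular cutoff value is stated in this passage, and the function and parameter names are hypothetical.

```python
import math


def two_gene_log2_ratio(hoxb13, il17br):
    """log2 of the HOXB13:IL17BR expression ratio for one sample.

    Both inputs are assumed to be positive expression levels measured on
    the same linear scale in the same sample.
    """
    if hoxb13 <= 0 or il17br <= 0:
        raise ValueError("expression levels must be positive")
    return math.log2(hoxb13 / il17br)


def suggests_non_responsiveness(hoxb13, il17br, log2_cutoff):
    """Compare the ratio with a cutoff supplied by the caller.

    `log2_cutoff` is a placeholder; it would have to be derived from
    reference data and is not a value given in the text.
    """
    return two_gene_log2_ratio(hoxb13, il17br) > log2_cutoff
```

Because the gene raised in non-responders is divided by the gene lowered in non-responders, the two changes reinforce each other in the ratio, and any sample-wide scaling factor cancels.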

A "polynucleotide" is a polymeric form of nucleotides of any length, either ribonucleotides 
or deoxyribonucleotides. This term refers only to the primary structure of the molecule. Thus, this
term includes double- and single-stranded DNA and RNA. It also includes known types of
modifications including labels known in the art, methylation, "caps", substitution of one or more of
the naturally occurring nucleotides with an analog, and internucleotide modifications such as
uncharged linkages (e.g., phosphorothioates, phosphorodithioates, etc.), as well as unmodified 
forms of the polynucleotide. 

The term "amplify" is used in the broad sense to mean creating an amplification product, which can
be made enzymatically with DNA or RNA polymerases. "Amplification," as used herein, generally
refers to the process of producing multiple copies of a desired sequence, particularly those of a
sample. "Multiple copies" mean at least 2 copies. A "copy" does not necessarily mean perfect
sequence complementarity or identity to the template sequence. Methods for amplifying mRNA are
generally known in the art, and include reverse transcription PCR (RT-PCR) and those described in
U.S. Patent Application 10/062,857 (filed on October 25, 2001), as well as U.S. Provisional Patent
Applications 60/298,847 (filed June 15, 2001) and 60/257,801 (filed December 22, 2000), all of
which are hereby incorporated by reference in their entireties as if fully set forth. Another method
which may be used is quantitative PCR (or Q-PCR). Alternatively, RNA may be directly labeled as
the corresponding cDNA by methods known in the art. 

By "corresponding", it is meant that a nucleic acid molecule shares a substantial amount of 
sequence identity with another nucleic acid molecule. Substantial amount means at least 95%, 
usually at least 98% and more usually at least 99%, and sequence identity is determined using the
BLAST algorithm, as described in Altschul et al. (1990), J. Mol. Biol. 215:403-410 (using the
published default setting, i.e. parameters w=4, t=17). 
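
The identity thresholds above may be illustrated with the following sketch, which computes percent identity over two sequences that have already been aligned to equal length. It is not an implementation of the BLAST algorithm; the gap character, the function names, and the treatment of gaps as mismatches are assumptions made for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns in which both sequences carry the same base.

    Columns that are gaps in both sequences are ignored; a gap opposite a
    base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def corresponds(aligned_a, aligned_b, threshold=95.0):
    """True when identity meets the stated threshold (at least 95% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Passing `threshold=98.0` or `threshold=99.0` applies the stricter levels mentioned in the text.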

A "microarray" is a linear or two-dimensional or three dimensional (and solid phase) array
of preferably discrete regions, each having a defined area, formed on the surface of a solid support
such as, but not limited to, glass, plastic, or synthetic membrane. The density of the discrete regions
on a microarray is determined by the total numbers of immobilized polynucleotides to be detected
on the surface of a single solid phase support, preferably at least about 50/cm², more preferably at
least about 100/cm², even more preferably at least about 500/cm², but preferably below about
1,000/cm². Preferably, the arrays contain less than about 500, about 1000, about 1500, about 2000,
about 2500, or about 3000 immobilized polynucleotides in total. As used herein, a DNA microarray
is an array of oligonucleotides or polynucleotides placed on a chip or other surfaces used to
hybridize to amplified or cloned polynucleotides from a sample. Since the position of each
particular group of primers in the array is known, the identities of a sample's polynucleotides can be
determined based on their binding to a particular position in the microarray. As an alternative to the
use of a microarray, an array of any size may be used in the practice of the invention, including an
arrangement of one or more positions of a two-dimensional or three dimensional arrangement in a
solid phase to detect expression of a single gene sequence.

Because the invention relies upon the identification of genes that are over- or under- 
expressed, one embodiment of the invention involves determining expression by hybridization of 
mRNA, or an amplified or cloned version thereof, of a sample cell to a polynucleotide that is unique
to a particular gene sequence. Preferred polynucleotides of this type contain at least about 16, at
least about 18, at least about 20, at least about 22, at least about 24, at least about 26, at least about
28, at least about 30, or at least about 32 consecutive basepairs of a gene sequence that is not found
in other gene sequences. The term "about" as used in the previous sentence refers to an increase or
decrease of 1 from the stated numerical value. Even more preferred are polynucleotides of at least
or about 50, at least or about 100, at least or about 150, at least or about 200, at least or about 250,
at least or about 300, at least or about 350, at least or about 400, at least or about 450, or at least or
about 500 consecutive bases of a sequence that is not found in other gene sequences. The term
"about" as used in the preceding sentence refers to an increase or decrease of 10% from the stated
numerical value. Longer polynucleotides may of course contain minor mismatches (e.g. via the
presence of mutations) which do not affect hybridization to the nucleic acids of a sample. Such
polynucleotides may also be referred to as polynucleotide probes that are capable of hybridizing to
sequences of the genes, or unique portions thereof, described herein. Such polynucleotides may be
labeled to assist in their detection. Preferably, the sequences are those of mRNA encoded by the
genes, the corresponding cDNA to such mRNAs, and/or amplified versions of such sequences. In
preferred embodiments of the invention, the polynucleotide probes are immobilized on an array, 
other solid support devices, or in individual spots that localize the probes. 

In another embodiment of the invention, all or part of a disclosed sequence may be 
amplified and detected by methods such as the polymerase chain reaction (PCR) and variations
thereof, such as, but not limited to, quantitative PCR (Q-PCR), reverse transcription PCR
(RT-PCR), and real-time PCR (including as a means of measuring the initial amounts of mRNA copies
for each sequence in a sample), optionally real-time RT-PCR or real-time Q-PCR. Such methods
would utilize one or two primers that are complementary to portions of a disclosed sequence, where
the primers are used to prime nucleic acid synthesis. The newly synthesized nucleic acids are
optionally labeled and may be detected directly or by hybridization to a polynucleotide of the
invention. The newly synthesized nucleic acids may be contacted with polynucleotides (containing
sequences) of the invention under conditions which allow for their hybridization. Additional 
methods to detect the expression of expressed nucleic acids include RNAse protection assays, 
including liquid phase hybridizations, and in situ hybridization of cells.

Alternatively, and in yet another embodiment of the invention, gene expression may be
determined by analysis of expressed protein in a cell sample of interest by use of one or more
antibodies specific for one or more epitopes of individual gene products (proteins), or proteolytic
fragments thereof, in said cell sample or in a bodily fluid of a subject. The cell sample may be one
of breast cancer epithelial cells enriched from the blood of a subject, such as by use of labeled
antibodies against cell surface markers followed by fluorescence activated cell sorting (FACS).
Such antibodies are preferably labeled to permit their easy detection after binding to the gene
product. Detection methodologies suitable for use in the practice of the invention include, but are
not limited to, immunohistochemistry of cell-containing samples or tissue, enzyme linked
immunosorbent assays (ELISAs) including antibody sandwich assays of cell-containing tissues or
blood samples, mass spectroscopy, and immuno-PCR.

The term "label" refers to a composition capable of producing a detectable signal indicative 
of the presence of the labeled molecule. Suitable labels include radioisotopes, nucleotide 
chromophores, enzymes, substrates, fluorescent molecules, chemiluminescent moieties, magnetic 
particles, bioluminescent moieties, and the like. As such, a label is any composition detectable by
spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means. 

The term "support" refers to conventional supports such as beads, particles, dipsticks, fibers, 
filters, membranes and silane or silicate supports such as glass slides. 

As used herein, a "breast tissue sample" or "breast cell sample" refers to a sample of breast 
tissue or fluid isolated from an individual suspected of being afflicted with, or at risk of developing,
breast cancer. Such samples are primary isolates (in contrast to cultured cells) and may be collected
by any non-invasive or minimally invasive means, including, but not limited to, ductal lavage, fine
needle aspiration, needle biopsy, the devices and methods described in U.S. Patent 6,328,709, or
any other suitable means recognized in the art. Alternatively, the "sample" may be collected by an
invasive method, including, but not limited to, surgical biopsy.

"Expression" and "gene expression" include transcription and/or translation of nucleic acid 
material. 

As used herein, the term "comprising" and its cognates are used in their inclusive sense; that 
is, equivalent to the term "including" and its corresponding cognates.

Conditions that "allow" an event to occur or conditions that are "suitable" for an event to
occur, such as hybridization, strand extension, and the like, or "suitable" conditions, are conditions
that do not prevent such events from occurring. Thus, these conditions permit, enhance, facilitate,
and/or are conducive to the event. Such conditions, known in the art and described herein, depend
upon, for example, the nature of the nucleotide sequence, temperature, and buffer conditions. These
conditions also depend on what event is desired, such as hybridization, cleavage, strand extension or
transcription.

Sequence "mutation," as used herein, refers to any sequence alteration in the sequence of a
gene of interest disclosed herein in comparison to a reference sequence. A sequence mutation includes
single nucleotide changes, or alterations of more than one nucleotide in a sequence, due to
mechanisms such as substitution, deletion or insertion. Single nucleotide polymorphism (SNP) is
also a sequence mutation as used herein. Because the present invention is based on the relative 
level of gene expression, mutations in non-coding regions of genes as disclosed herein may also be 
assayed in the practice of the invention. 

"Detection" includes any means of detecting, including direct and indirect detection of gene
expression and changes therein. For example, "detectably less" products may be observed directly
or indirectly, and the term indicates any reduction (including the absence of detectable signal).
Similarly, "detectably more" product means any increase, whether observed directly or indirectly.

Increases and decreases in expression of the disclosed sequences are defined in the
following terms based upon percent or fold changes over expression in normal cells. Increases may
be of 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 140, 160, 180, or 200% relative to expression
levels in normal cells. Alternatively, fold increases may be of 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6,
6.5, 7, 7.5, 8, 8.5, 9, 9.5, or 10 fold over expression levels in normal cells. Decreases may be of 10,
20, 30, 40, 50, 55, 60, 65, 70, 75, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 99 or 100% relative to
expression levels in normal cells.
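
The relationship between percent changes and expression ratios may be sketched as follows, using the convention stated elsewhere in this description that a ratio of 2 is a 100% (two-fold) increase and a ratio of one-half is half as much expression. The function names are hypothetical and are introduced for illustration only.

```python
def ratio_from_percent_increase(percent):
    """Ratio to normal-cell expression for a given percent increase."""
    if percent < 0:
        raise ValueError("percent increase must be non-negative")
    return 1.0 + percent / 100.0


def ratio_from_percent_decrease(percent):
    """Ratio to normal-cell expression for a given percent decrease.

    A 100% decrease corresponds to a ratio of 0 (no detectable expression).
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent decrease must lie between 0 and 100")
    return 1.0 - percent / 100.0


def percent_change_from_ratio(ratio):
    """Signed percent change relative to normal cells for a given ratio."""
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    return (ratio - 1.0) * 100.0
```

Under this convention a 200% increase is a ratio of 3 and a 50% decrease is a ratio of 0.5.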

Unless defined otherwise all technical and scientific terms used herein have the same
meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

Embodiments of the Invention

In a first aspect, the disclosed invention relates to the identification and use of gene 
expression patterns (or profiles or "signatures") which discriminate between (or are correlated with) 
breast cancer survival in a subject treated with tamoxifen (TAM) or another "antiestrogen" agent 
against breast cancer. Such patterns may be determined by the methods of the invention by use of a
number of reference cell or tissue samples, such as those reviewed by a pathologist of ordinary skill
in the pathology of breast cancer, which reflect breast cancer cells as opposed to normal or other
non-cancerous cells. The outcomes experienced by the subjects from whom the samples were obtained
may be correlated with expression data to identify patterns that correlate with the outcomes following
treatment with TAM or another "antiestrogen" agent against breast cancer. Because the overall
gene expression profile differs from person to person, cancer to cancer, and cancer cell to cancer 
cell, correlations between certain cells and genes expressed or underexpressed may be made as 
disclosed herein to identify genes that are capable of discriminating between breast cancer 
outcomes. 

The present invention may be practiced with any number of the genes believed, or likely to
be, differentially expressed with respect to breast cancer outcomes, particularly in cases of ER+
breast cancer. The identification may be made by using expression profiles of various homogeneous
breast cancer cell populations, which were isolated by microdissection, such as, but not limited to,
laser capture microdissection (LCM) of 100-1000 cells. The expression level of each gene of the
expression profile may be correlated with a particular outcome. Alternatively, the expression levels
of multiple genes may be clustered to identify correlations with particular outcomes. 

Genes with significant correlations to breast cancer survival when the subject is treated with 
tamoxifen may be used to generate models of gene expressions that would maximally discriminate 
between outcomes where a subject responds to treatment with tamoxifen or another "antiestrogen"
agent against breast cancer and outcomes where the treatment is not successful. Alternatively,
genes with significant correlations may be used in combination with genes with lower correlations
without significant loss of ability to discriminate between outcomes. Such models may be
generated by any appropriate means recognized in the art, including, but not limited to, cluster
analysis, support vector machines, neural networks or other algorithms known in the art. The
models are capable of predicting the classification of an unknown sample based upon the expression
of the genes used for discrimination in the models. "Leave one out" cross-validation may be used
to test the performance of various models and to help identify weights (genes) that are
uninformative or detrimental to the predictive ability of the models. Cross-validation may also be
used to identify genes that enhance the predictive ability of the models.
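
"Leave one out" cross-validation of the kind referred to above may be sketched as follows. The nearest-centroid classifier is a stand-in chosen only for brevity and is not one of the models named in the text; the expression values, function names, and the use of labels 0 (Responder) and 1 (Non-responder) in the usage note are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Predict the label whose class centroid lies closest to the sample."""
    sums, counts = {}, {}
    for features, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        centroid = [value / counts[label] for value in acc]
        dist = sum((a - b) ** 2 for a, b in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(x, y, predict=nearest_centroid_predict):
    """Hold out each sample in turn, train on the rest, and score the guess."""
    correct = 0
    for i in range(len(x)):
        train_x = x[:i] + x[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if predict(train_x, train_y, x[i]) == y[i]:
            correct += 1
    return correct / len(x)


def select_genes(x, gene_indices):
    """Keep only the listed gene columns, to compare gene subsets."""
    return [[row[i] for i in gene_indices] for row in x]
```

Comparing `leave_one_out_accuracy(select_genes(x, subset), y)` across candidate subsets gives one way to see whether removing a gene lowers, raises, or leaves unchanged the held-out accuracy, which is the sense in which cross-validation can flag uninformative or detrimental genes.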

The gene(s) identified as correlated with particular breast cancer outcomes relating to 
tamoxifen treatment by the above models provide the ability to focus gene expression analysis to 
only those genes that contribute to the ability to identify a subject as likely to have a particular 
outcome relative to another. The expression of other genes in a breast cancer cell would be 
relatively unable to provide information concerning, and thus assist in the discrimination of, a breast
cancer outcome. 

As will be appreciated by those skilled in the art, the models are highly useful with even a 
small set of reference gene expression data and can become increasingly accurate with the inclusion 
of more reference data although the incremental increase in accuracy will likely diminish with each
additional datum. The preparation of additional reference gene expression data using genes
identified and disclosed herein for discriminating between different outcomes in breast cancer 
following treatment with tamoxifen or another "antiestrogen" agent against breast cancer is routine 
and may be readily performed by the skilled artisan to permit the generation of models as described 
above to predict the status of an unknown sample based upon the expression levels of those genes. 

To determine the (increased or decreased) expression levels of genes in the practice of the
present invention, any method known in the art may be utilized. In one preferred embodiment of
the invention, expression based on detection of RNA which hybridizes to the genes identified and
disclosed herein is used. This is readily performed by any RNA detection or
amplification+detection method known or recognized as equivalent in the art such as, but not
limited to, reverse transcription-PCR, the methods disclosed in U.S. Patent Application 10/062,857
(filed on October 25, 2001) as well as U.S. Provisional Patent Applications 60/298,847 (filed June 
15, 2001) and 60/257,801 (filed December 22, 2000), and methods to detect the presence, or 
absence, of RNA stabilizing or destabilizing sequences.

Alternatively, expression based on detection of DNA status may be used. Detection of the
DNA of an identified gene as methylated or deleted may be used for genes that have decreased
expression in correlation with a particular breast cancer outcome. This may be readily performed
by PCR based methods known in the art, including, but not limited to, Q-PCR. Conversely,
detection of the DNA of an identified gene as amplified may be used for genes that have increased
expression in correlation with a particular breast cancer outcome. This may be readily performed 
by PCR based, fluorescent in situ hybridization (FISH) and chromosome in situ hybridization 
(CISH) methods known in the art. 

Expression based on detection of a presence, increase, or decrease in protein levels or 
activity may also be used. Detection may be performed by any immunohistochemistry (IHC) based, 
blood based (especially for secreted proteins), antibody (including autoantibodies against the 
protein) based, exfoliate cell (from the cancer) based, mass spectroscopy based, and image 
(including use of labeled ligand) based method known in the art and recognized as appropriate for 
the detection of the protein. Antibody and image based methods are additionally useful for the 
localization of tumors after determination of cancer by use of cells obtained by a non-invasive 
procedure (such as ductal lavage or fine needle aspiration), where the source of the cancerous cells 
is not known. A labeled antibody or ligand may be used to localize the carcinoma(s) within a 
patient or to assist in the enrichment of exfoliated cancer cells from a bodily fluid. 

A preferred embodiment using a nucleic acid based assay to determine expression is by 
immobilization of one or more sequences of the genes identified herein on a solid support, 
including, but not limited to, a solid substrate as an array or to beads or bead based technology as 
known in the art. Alternatively, solution based expression assays known in the art may also be 
used. The immobilized gene(s) may be in the form of polynucleotides that are unique or otherwise 
specific to the gene(s) such that the polynucleotide would be capable of hybridizing to a DNA or 
RNA corresponding to the gene(s). These polynucleotides may be the full length of the gene(s) or 
be short sequences of the genes (up to one nucleotide shorter than the full length sequence known in 
the art by deletion from the 5' or 3' end of the sequence) that are optionally minimally interrupted 
(such as by mismatches or inserted non-complementary basepairs) such that hybridization with a 
DNA or RNA corresponding to the gene(s) is not affected. Preferably, the polynucleotides used are 
from the 3' end of the gene, such as within about 350, about 300, about 250, about 200, about 150, 
about 100, or about 50 nucleotides from the polyadenylation signal or polyadenylation site of a gene 
or expressed sequence. Polynucleotides containing mutations relative to the sequences of the 
disclosed genes may also be used so long as the presence of the mutations still allows hybridization 
to produce a detectable signal. 
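
By way of illustration only, the selection of a 3' region lying within a chosen number of nucleotides of the polyadenylation signal may be sketched as follows. The sketch assumes the canonical AATAAA hexamer as the polyadenylation signal and falls back to the end of the sequence when no hexamer is found; the function name `three_prime_window` and the toy sequence are hypothetical and are not part of this disclosure.

```python
def three_prime_window(seq, max_len=300, signal="aataaa"):
    """Return up to max_len nucleotides immediately upstream of the last
    occurrence of the assumed polyadenylation signal.

    If the signal is not found, the 3' end of the sequence is used instead.
    Spaces (as used in grouped sequence listings) are ignored.
    """
    s = seq.lower().replace(" ", "")
    pos = s.rfind(signal)
    end = pos if pos != -1 else len(s)
    return s[max(0, end - max_len):end]


# Toy sequence for illustration only (not a disclosed sequence).
toy = "ccccggggtt ttaataaagc aaaaaa"
print(three_prime_window(toy, max_len=8))
```

The window length corresponds to the "about 350" through "about 50" values recited above and may be varied accordingly.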

The immobilized gene(s) may be used to determine the state of nucleic acid samples 
prepared from sample breast cell(s) for which the outcome of the sample's subject (e.g. patient from 
whom the sample is obtained) is not known or for confirmation of an outcome that is already 
assigned to the sample's subject. Without limiting the invention, such a cell may be from a patient 
with ER+ or ER- breast cancer. The immobilized polynucleotide(s) need only be sufficient to 
specifically hybridize to the corresponding nucleic acid molecules derived from the sample under 
suitable conditions. While even a single correlated gene sequence may be able to provide adequate 
accuracy in discriminating between two breast cancer outcomes, two or more, three or more, four or 
more, five or more, six or more, seven or more, eight or more, nine or more, ten or more, or eleven 
or more of the genes identified herein may be used in combination, as a subset capable of 
discriminating, to increase the accuracy of the method. The invention specifically 
contemplates the selection of more than one, two or more, three or more, four or more, five or more, 
six or more, seven or more, eight or more, nine or more, ten or more, or eleven or more of the genes 
disclosed in the tables and figures herein for use as a subset in the identification of breast cancer 
survival outcome. 

Of course 1 or more, 2 or more, 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or 
more, 9 or more, or all the genes provided in Tables 2 and/or 3 below may be used. "Accession" as 
used in the context of the Tables herein as well as the present invention refers to the GenBank 
accession number of a sequence of each gene, the sequences of which are hereby incorporated by 
reference in their entireties as they are available from GenBank as accessed on the filing date of the 
present application. P value refers to values assigned as described in the Examples below. The 
indications of "E-xx" where "xx" is a two digit number refer to alternative notation for exponential 
figures where "E-xx" is "10^-xx". Thus in combination with the numbers to the left of "E-xx", the 
value being represented is the numbers to the left times 10^-xx. "Description" as used in the Tables 
provides a brief identifier of what the sequence/gene encodes. 
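
As a non-limiting illustration of the "E-xx" notation, the conversion may be sketched as below. The entry "3.5E-04" is a hypothetical value chosen for the example and is not taken from the Tables; the function name `parse_e_notation` is likewise an assumption of the sketch.

```python
import math


def parse_e_notation(entry):
    """Convert a table entry written as '<number>E-xx' into a float."""
    mantissa, marker, exponent = entry.strip().upper().partition("E")
    if not marker:
        # No exponent part: the entry is an ordinary decimal number.
        return float(mantissa)
    # "E-xx" stands for "times 10 to the power -xx".
    return float(mantissa) * 10.0 ** int(exponent)


# The explicit rule agrees with Python's built-in parsing of the same notation.
value = parse_e_notation("3.5E-04")  # hypothetical entry
assert math.isclose(value, float("3.5E-04"))
```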

Genes with a correlation identified by a p value below or about 0.02, below or about 0.01, 
below or about 0.005, or below or about 0.001 are preferred for use in the practice of the invention. 
The present invention includes the use of gene(s) the expression of which identify different breast 
cancer outcomes after treatment with TAM or another "antiestrogen" agent against breast cancer to 
permit simultaneous identification of breast cancer survival outcome of a patient based upon 
assaying a breast cancer sample from said patient. 
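
The selection of genes by p value cutoff may be sketched, without limitation, as follows. The gene names and p values are hypothetical placeholders rather than entries from the Tables, and treating "below or about" as "at or below" is a simplifying assumption of the sketch.

```python
# Hypothetical p values for illustration only; not taken from the Tables.
example_p_values = {
    "GENE_A": 4.0e-4,
    "GENE_B": 8.0e-3,
    "GENE_C": 1.5e-2,
    "GENE_D": 3.0e-2,
}

# Cutoffs recited in the text.
CUTOFFS = (0.02, 0.01, 0.005, 0.001)


def genes_at_or_below(p_values, cutoff):
    """Return gene names whose p value is at or below the cutoff,
    ordered from the smallest p value to the largest."""
    selected = [(p, gene) for gene, p in p_values.items() if p <= cutoff]
    return [gene for p, gene in sorted(selected)]


for cutoff in CUTOFFS:
    print(cutoff, genes_at_or_below(example_p_values, cutoff))
```

A stricter cutoff yields a smaller subset, which may then be used alone or in combination as described above.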

In a second aspect, the present invention relates to the identification and use of three sets of 
sequences for the determination of responsiveness of ER+ breast cancer to treatment with TAM or 
another "antiestrogen" agent against breast cancer. The differential expression of these sequences 
in breast cancer relative to normal breast cells is used to predict responsiveness to TAM or another 
"antiestrogen" agent against breast cancer in a subject. 

To identify gene expression patterns in ER positive, early stage invasive breast cancers that 
might predict response to hormonal therapy, microarray gene expression analysis was performed on 
tumors from 60 women uniformly treated with adjuvant tamoxifen alone. These patients were 
identified from a total of 103 ER+ early stage cases presenting to Massachusetts General Hospital 
between 1987 and 1997, from whom tumor specimens were snap frozen and for whom minimal 5 
year follow-up was available (see Table 1 for details). Within this cohort, 28 (46%) women 
developed distant metastasis with a median time to recurrence of 4 years ("tamoxifen 
non-responders") and 32 (54%) women remained disease-free with median follow-up of 10 years 
("tamoxifen responders"). Responders were matched with non-responder cases with respect to 
TNM staging (see Singletary, S.E. et al. "Revision of the American Joint Committee on Cancer 
staging system for breast cancer." J Clin Oncol 20, 3628-36 (2002)) and tumor grade (see Dalton, 
L.W. et al. "Histologic grading of breast cancer: linkage of patient outcome with level of 
pathologist agreement." Mod Pathol 13, 730-5. (2000)). 

Previous studies linking gene expression profiles to clinical outcome in breast cancer have 
demonstrated that the potential for distant metastasis and overall survival probability may be 
predictable through biological characteristics of the primary tumor at the time of diagnosis (see 
Huang, E. et al. "Gene expression predictors of breast cancer outcomes." Lancet 361, 1590-6 
(2003); Sorlie, T. et al. "Gene expression patterns of breast carcinomas distinguish tumor 
subclasses with clinical implications." Proc Natl Acad Sci U S A 98:10869-74 (2001); Sorlie, T. et 
al. "Repeated observation of breast tumor subtypes in independent gene expression data sets." Proc 
Natl Acad Sci U S A 100, 8418-23 (2003); Sotiriou, C. et al. "Breast cancer classification and 
prognosis based on gene expression profiles from a population-based study." Proc Natl Acad Sci U 
S A 100, 10393-8 (2003); van 't Veer, L.J. et al. "Gene expression profiling predicts clinical 
outcome of breast cancer." Nature 415, 530-6 (2002); and van de Vijver, M.J. et al. "A 
gene-expression signature as a predictor of survival in breast cancer." N Engl J Med 347, 1999-2009 
(2002)). In particular, a 70-gene expression signature has proven to be a strong prognostic factor, 
out-performing all known clinicopathological parameters. However, in those studies patients either 
received no adjuvant therapy (van 't Veer, L.J. et al. Nature 2002) or were treated non-uniformly 
with hormonal and chemotherapeutic regimens (Huang, E. et al.; Sorlie, T. et al.; Sorlie, T. et al.; 
Sotiriou, C. et al.; and van de Vijver, M.J. et al. N Engl J Med 2002). Patients with ER+ early-stage 
breast cancer treated with tamoxifen alone, such as the cohort studied here, represent only a subset 
of the population tested with the 70-gene signature. Of note, 61 of the genes in the 70-gene 
signature were present on the microarray used as described below, but no significant association 
with clinical outcome was observed in the defined subset of patients. 

In comparison with existing biomarkers, including ESR1, PGR, ERBB2 and EGFR, three 
sets of gene sequences disclosed herein are significantly more predictive of responsiveness to TAM 
treatment. Multivariate analysis indicated that these three genes were significant predictors of 
clinical outcome independent of tumor size, nodal status and tumor grade. ER and progesterone 
receptor (PR) expression have been the major clinicopathological predictors for response to TAM. 
However, up to 40% of ER+ tumors fail to respond or develop resistance to TAM. The invention 
thus provides for the use of the identified biomarkers to allow better patient management by 
identifying patients who are more likely to benefit from TAM or other endocrine therapy and those 
who are likely to develop resistance and tumor recurrence. 

As noted herein, the sequence(s) identified by the present invention are expressed in 
correlation with ER+ breast cancer cells. For example, IL17BR, identified by I.M.A.G.E. 
Consortium Clusters NM_018725 and NM_172234 ("The I.M.A.G.E. Consortium: An Integrated 
Molecular Analysis of Genomes and their Expression," Lennon et al., 1996, Genomics 33:151-152; 
see also image.llnl.gov) has been found to be useful in predicting responsiveness to TAM treatment. 
In preferred embodiments of the invention, any sequence, or unique portion thereof, of the 
IL17BR sequences of the cluster, as well as the UniGene Homo sapiens cluster Hs.5470, may be 
used. Similarly, any sequence encoding all or a part of the protein encoded by any IL17BR 
sequence disclosed herein may be used. Consensus sequences of I.M.A.G.E. Consortium clusters 
are as follows, with the assigned coding region (ending with a termination codon) underlined and 
preceded by the 5' untranslated and/or non-coding region and followed by the 3' untranslated 
and/or non-coding region: 



SEQ ID NO:1 (consensus sequence for IL17BR, transcript variant 1, identified as NM_018725 or 
NM_018725.2) 

agcgcagcgt gcgggtggcc tggatcccgc gcagtggccc ggcgatgtcg ctcgtgctgc 
taagcctggc cgcgctgtgc aggagcgccg taccccgaga gccgaccgtt caatgtggct 
ctgaaactgg gccatctcca gagtggatgc tacaacatga tctaatcccc ggagacttga 
gggacctccg agtagaacct gttacaacta gtgttgcaac aggggactat tcaattttga 
tgaatgtaag ctgggtactc cgggcagatg ccagcatccg cttgttgaag gccaccaaga 
tttgtgtgac gggcaaaagc aacttccagt cctacagctg tgtgaggtgc aattacacag 
aggccttcca gactcagacc agaccctctg gtggtaaatg gacattttcc tacatcggct 
tccctgtaga gctgaacaca gtctatttca ttggggccca taatattcct aatgcaaata 
tgaatgaaga tggcccttcc atgtctgtga atttcacctc accaggctgc ctagaccaca 
taatgaaata taaaaaaaag tgtgtcaagg ccggaagcct gtgggatccg aacatcactg 
cttgtaagaa gaatgaggag acagtagaag tgaacttcac aaccactccc ctgggaaaca 
gatacatggc tcttatccaa cacagcacta tcatcgggtt ttctcaggtg tttgagccac 
accagaagaa acaaacgcga gcttcagtgg tgattccagt gactggggat agtgaaggtg 
ctacggtgca gctgactcca tattttccta cttgtggcag cgactgcatc cgacataaag 
gaacagttgt gctctgccca caaacaggcg tccctttccc tctggataac aacaaaagca 
agccgggagg ctggctgcct ctcctcctgc tgtctctgct ggtggccaca tgggtgctgg 
tggcagggat ctatctaatg tggaggcacg aaaggatcaa gaagacttcc ttttctacca 
ccacactact gccccccatt aaggttcttg tggtttaccc atctgaaata tgtttccatc 
acacaatttg ttacttcact gaatttcttc aaaaccattg cagaagtgag gtcatccttg 
aaaagtggca gaaaaagaaa atagcagaga tgggtccagt gcagtggctt gccactcaaa 
agaaggcagc agacaaagtc gtcttccttc tttccaatga cgtcaacagt gtgtgcgatg 
gtacctgtgg caagagcgag ggcagtccca gtgagaactc tcaagacctc ttcccccttg 
cctttaacct tttctgcagt gatctaagaa gccagattca tctgcacaaa tacgtggtgg 
tctactttag agagattgat acaaaagacg attacaatgc tctcagtgtc tgccccaagt 
accacctcat gaaggatgcc actgctttct gtgcagaact tctccatgtc aagcagcagg 
tgtcagcagg aaaaagatca caagcctgcc acgatggctg ctgctccttg tagcccaccc 
atgagaagca agagacctta aaggcttcct atcccaccaa ttacagggaa aaaacgtgtg 
atgatcctga agcttactat gcagcctaca aacagcctta gtaattaaaa cattttatac 
caataaaatt ttcaaatatt gctaactaat gtagcattaa ctaacgattg gaaactacat 
ttacaacttc aaagctgttt tatacataga aatcaattac agttttaatt gaaaactata 
accattttga taatgcaaca ataaagcatc ttcagccaaa catctagtct tccatagacc 
atgcattgca gtgtacccag aactgtttag ctaatattct atgtttaatt aatgaatact 
aactctaaga acccctcact gattcactca atagcatctt aagtgaaaaa ccttctatta 
catgcaaaaa atcattgttt ttaagataac aaaagtaggg aataaacaag ctgaacccac 
ttttaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaa 



SEQ ID NO:2 (consensus sequence for IL17BR, transcript variant 2, identified as NM_172234 or 
NM_172234.1) 

agcgcagcgt gcgggtggcc tggatcccgc gcagtggccc ggcgatgtcg ctcgtgctgc 
taagcctggc cgcgctgtgc aggagcgccg taccccgaga gccgaccgtt caatgtggct 
ctgaaactgg gccatctcca gagtggatgc tacaacatga tctaatcccc ggagacttga 
gggacctccg agtagaacct gttacaacta gtgttgcaac aggggactat tcaattttga 
tgaatgtaag ctgggtactc cgggcagatg ccagcatccg cttgttgaag gccaccaaga 
tttgtgtgac gggcaaaagc aacttccagt cctacagctg tgtgaggtgc aattacacag 
aggccttcca gactcagacc agaccctctg gtggtaaatg gacattttcc tacatcggct 
tccctgtaga gctgaacaca gtctatttca ttggggccca taatattcct aatgcaaata 
tgaatgaaga tggcccttcc atgtctgtga atttcacctc accaggctgc ctagaccaca 
taatgaaata taaaaaaaag tgtgtcaagg ccggaagcct gtgggatccg aacatcactg 
cttgtaagaa gaatgaggag acagtagaag tgaacttcac aaccactccc ctgggaaaca 
gatacatggc tcttatccaa cacagcacta tcatcgggtt ttctcaggtg tttgagccac 
accagaagaa acaaacgcga gcttcagtgg tgattccagt gactggggat agtgaaggtg 
ctacggtgca ggtaaagttc agtgagctgc tctggggagg gaagggacat agaagactgt 
tccatcattc attgctttta aggatgagtt ctctcttgtc aaatgcactt ctgccagcag 
acaccagtta agtggcgttc atgggggctc tttcgctgca gcctccaccg tgctgaggtc 
aggaggccga cgtggcagtt gtggtccctt ttgcttgtat taatggctgc tgaccttcca 
aagcactttt tattttcatt ttctgtcaca gacactcagg gatagcagta ccattttact 
tccgcaagcc tttaactgca agatgaagct gcaaagggtt tgaaatggga aggtttgagt 
tccaggcagc gtatgaactc tggagagggg ctgccagtcc tctctgggcc gcagcggacc 
cagctggaac acaggaagtt ggagcagtag gtgctccttc acctctcagt atgtctcttt 
caactctagt ttttgaggtg gggacacagg aggtccagtg ggacacagcc actccccaaa 
gagtaaggag cttccatgct tcattccctg gcataaaaag tgctcaaaca caccagaggg 
ggcaggcacc agccagggta tgatggctac tacccttttc tggagaacca tagacttccc 
ttactacagg gacttgcatg tcctaaagca ctggctgaag gaagccaaga ggatcactgc 
tgctcctttt ttctagagga aatgtttgtc tacgtggtaa gatatgacct agccctttta 
ggtaagcgaa ctggtatgtt agtaacgtgt acaaagttta ggttcagacc ccgggagtct 
tgggcacgtg ggtctcgggt cactggtttt gactttaggg ctttgttaca gatgtgtgac 
caaggggaaa atgtgcatga caacactaga ggtatgggcg aagccagaaa gaagggaagt 
tttggctgaa gtaggagtct tggtgagatt ttgctctgat gcatggtgtg aactttctga 
gcctcttgtt tttcctcagc tgactccata ttttcctact tgtggcagcg actgcatccg 
acataaagga acagttgtgc tctgcccaca aacaggcgtc cctttccctc tggataacaa 
caaaagcaag ccgggaggct ggctgcctct cctcctgctg tctctgctgg tggccacatg 
ggtgctggtg gcagggatct atctaatgtg gaggcacgaa aggatcaaga agacttcctt 
ttctaccacc acactactgc cccccattaa ggttcttgtg gtttacccat ctgaaatatg 
tttccatcac acaatttgtt acttcactga atttcttcaa aaccattgca gaagtgaggt 
catccttgaa aagtggcaga aaaagaaaat agcagagatg ggtccagtgc agtggcttgc 
cactcaaaag aaggcagcag acaaagtcgt cttccttctt tccaatgacg tcaacagtgt 
gtgcgatggt acctgtggca agagcgaggg cagtcccagt gagaactctc aagacctctt 
cccccttgcc tttaaccttt tctgcagtga tctaagaagc cagattcatc tgcacaaata 
cgtggtggtc tactttagag agattgatac aaaagacgat tacaatgctc tcagtgtctg 
ccccaagtac cacctcatga aggatgccac tgctttctgt gcagaacttc tccatgtcaa 
gcagcaggtg tcagcaggaa aaagatcaca agcctgccac gatggctgct gctccttgta 
gcccacccat gagaagcaag agaccttaaa ggcttcctat cccaccaatt acagggaaaa 
aacgtgtgat gatcctgaag cttactatgc agcctacaaa cagccttagt aattaaaaca 
ttttatacca ataaaatttt caaatattgc taactaatgt agcattaact aacgattgga 
aactacattt acaacttcaa agctgtttta tacatagaaa tcaattacag ttttaattga 
aaactataac cattttgata atgcaacaat aaagcatctt cagccaaaca tctagtcttc 
catagaccat gcattgcagt gtacccagaa ctgtttagct aatattctat gtttaattaa 
tgaatactaa ctctaagaac ccctcactga ttcactcaat agcatcttaa gtgaaaaacc 
ttctattaca tgcaaaaaat cattgttttt aagataacaa aagtagggaa taaacaagct 
gaacccactt ttaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaa 

I.M.A.G.E. Consortium Clone ID numbers and the corresponding GenBank accession 
numbers of sequences identified as belonging to the I.M.A.G.E. Consortium and UniGene clusters, 
are listed below. Also included are sequences that are not identified as having a Clone ID number 
but still identified as being those of IL17BR. The sequences include those of the "sense" and 
complementary strands sequences corresponding to IL17BR. The sequence of each GenBank 
accession number is presented in the attached Appendix. 
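
Because the listed sequences include both "sense" and complementary strands, it may be convenient to convert between the two. The following is a minimal sketch of that conversion, assuming standard Watson-Crick pairing and plain a/c/g/t/n symbols; the function name `reverse_complement` is hypothetical and is not part of this disclosure.

```python
# Translation table for complementary bases (lower and upper case).
_COMPLEMENT = str.maketrans("acgtnACGTN", "tgcanTGCAN")


def reverse_complement(seq):
    """Return the complementary strand read 5' to 3'.

    Spaces (as used in grouped sequence listings) are removed first.
    """
    return seq.replace(" ", "").translate(_COMPLEMENT)[::-1]


# First ten nucleotides of SEQ ID NO:1 as shown above.
print(reverse_complement("agcgcagcgt"))
```

Applying the function twice returns the original strand, which provides a simple consistency check.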



Clone ID numbers         GenBank accession numbers 

2985728                  AW675096, AW673932, BC000980 
5286745                  BI602183 
5278067                  BI458542 
5182255                  BI823321 
924000                   AA514396 
3566736                  BF110326 
3195409                  BE466508 
3576775                  BF740045 
2772915                  AW299271 
1368826                  AA836217 
1744837                  AI203628 
2285564                  AI627783 
2217709                  AI744263 
2103651                  AI401622 
2419487                  AI826949 
3125592                  BE047352 
2284721                  AI911549 
3643302                  BF194822 
1646910                  AI034244 
1647001                  AI033911 
3323709                  BF064177 
1419779                  AA847767 
2205190                  AI538624 
2295838                  AI913613 
2461335                  AI942234 
2130362                  AI580483 
2385555                  AI831909 
2283817                  AI672344 
2525596                  AW025192 
454687                   AA677205 
1285273                  AA721647 
3134106                  BF115018 
342259                   W61238, W61239 
1651991                  AI032064 
2687714                  AW236941 
3302808                  BG057174 
2544461                  AW058532 
122014                   T98360, T98361 
2139250                  AI470845 
2133899                  AI497731 
121300                   T96629, T96740 
162274                   H25975, H25941 
3446667                  BE539514, BX282554 
156864                   R74038, R74129 
4611491                  BG433769 
4697316                  BG530489 
429376                   AA007528, AA007529 
5112415                  BI260259 
701357                   AA287951, AA287911 
121909                   T97852, T97745 
268037                   N40294 
1307489                  AA809841 
1357543                  AA832389 
48442                    H14692 
1302619                  AA732635 
1562857                  AA928257 
1731938                  AI184427 
1896025                  AI298577 
2336350                  AI692717 
1520997                  AA910922 
240506                   H90761 
2258560                  AI620122 
1569921                  AI793318, AA962325, AI733290 
6064627                  BQ226353 
299018                   W04890 
5500181                  BM455231 
2484011                  BI492426 
4746376                  BG674622 
233783                   BX111256 
1569921                  BX117618 
450450                   AA682806 
1943085                  AI202376 
2250390                  AI658949 
4526156                  BG403405 
4556884                  BE673417 
3249181                  AW021469 
2484395                  CF455736 
30515867                 AW339874 
2878155                  BG399724 
3254505                  BF475787 
3650593                  BF437145 
233783                   H64601 
None (mRNA sequences)    AF212365, AF208110, AF208111, AF250309, AK095091 
None                     BM983744, CB305764, BM715988, BM670929, 
                         BI792416, BI715216, N56060, CB241389, 
                         AV660618, BX088671, CB154426, CA434589, 
                         CA412162, CA314073, BF921554, BF920093, 
                         AV685699, AV650175, BX483104, CD675121, 
                         BE081436, AW970151, AW837146, 
                         AW368264, D25960, AV709899, BX431018, 
                         AL535617, AL525465, BX453536, BX453537, 
                         AV728945, AV728939, AV727345 



In one preferred embodiment, any sequence, or unique portion thereof, of the following 
IL17BR sequence, identified by AF208111 or AF208111.1, may be used in the practice of the 
invention. 

SEQ ID NO:3 (sequence for IL17BR): 



CGGCGATGTCGCTCGTGCTGATAAGCCTGGCCGCGCTGTGCAGGAGCGCCGTACCCCGAG 
AGCCGACCGTTCAATGTGGCTCTGAAACTGGGCCATCTCCAGAGTGGATGCTACAACATG 
ATCTAATCCCCGGAGACTTGAGGGACCTCCGAGTAGAACCTGTTACAACTAGTGTTGCAA 
CAGGGGACTATTCAATTTTGATGAATGTAAGCTGGGTACTCCGGGCAGATGCCAGCATCC 
GCTTGTTGAAGGCCACCAAGATTTGTGTGACGGGCAAAAGCAACTTCCAGTCCTACAGCT 
GTGTGAGGTGCAATTACACAGAGGCCTTCCAGACTCAGACCAGACCCTCTGGTGGTAAAT 
GGACATTTTCCTATATCGGCTTCCCTGTAGAGCTGAACACAGTCTATTTCATTGGGGCCC 
ATAATATTCCTAATGCAAATATGAATGAAGATGGCCCTTCCATGTCTGTGAATTTCACCT 
CACCAGGCTGCCTAGACCACATAATGAAATATAAAAAAAAGTGTGTCAAGGCCGGAAGCC 
TGTGGGATCCGAACATCACTGCTTGTAAGAAGAATGAGGAGACAGTAGAAGTGAACTTCA 
CAACCACTCCCCTGGGAAACAGATACATGGCTCTTATCCAACACAGCACTATCATCGGGT 
TTTCTCAGGTGTTTGAGCCACACCAGAAGAAACAAACGCGAGCTTCAGTGGTGATTCCAG 
TGACTGGGGATAGTGAAGGTGCTACGGTGCAGGTAAAGTTCAGTGAGCTGCTCTGGGGAG 
GGAAGGGACATAGAAGACTGTTCCATCATTCATTGCTTTTAAGGATGAGTTCTCTCTTGT 
CAAATGCACTTCTGCCAGCAGACACCAGTTAAGTGGCGTTCATGGGGGTTCTTTCGCTGC 
AGCCTCCACCGTGCTGAGGTCAGGAGGCCGACGTGGCAGTTGTGGTCCCTTTTGCTTGTA 
TTAATGGCTGCTGACCTTCCAAAGCACTTTTTATTTTCATTTTCTGTCACAGACACTCAG 
GGATAGCAGTACCATTTTACTTCCGCAAGCCTTTAACTGCAAGATGAAGCTGCAAAGGGT 
TTGAAATGGGAAGGTTTGAGTTCCAGGCAGCGTATGAACTCTGGAGAGGGGCTGCCAGTC 
CTCTCTGGGCCGCAGCGGACCCAGCTGGAACACAGGAAGTTGGAGCAGTAGGTGCTCCTT 
CACCTCTCAGTATGTCTCTTTCAACTCTAGTTTTTGAAGTGGGGACACAGGAAGTCCAGT 
GGGGACACAGCCACTCCCCAAAGAATAAGGAACTTCCATGCTTCATTCCCTGGCATAAAA 
AGTGNTCAAACACACCAGAGGGGGCAGGCACCAGCCAGGGTATGATGGGTACTACCCTTT 
TCTGGAGAACCATAGACTTCCCTTACTACAGGGACTTGCATGTCCTAAAGCACTGGCTGA 
AGGAAGCCAAGAGGATCACTGCTGCTCCTTTTTTGTAGAGGAAATGTTTGTGTACGTGGT 
AAGATATGACCTAGCCCTTTTAGGTAAGCGAACTGGTATGTTAGTAACGTGTACAAAGTT 
TAGGTTCAGACCCCGGGAGTCTTGGGCATGTGGGTCTCGGGTCACTGGTTTTGACTTTAG 
GGCTTTGTTACAGATGTGTGACCAAGGGGAAAATGTGCATGACAACACTAGAGGTAGGGG 
CGAAGCCAGAAAGAAGGGAAGTTTTGGCTGAAGTAGGAGTCTTGGTGAGATTTTGCTGTG 
ATGCATGGTGTGAACTTTCTGAGCCTCTTGTTTTTCCTCAGCTGACTCCATATTTTCCTA 
CTTGTGGCAGCGACTGCATCCGACATAAAGGAACAGTTGTGCTCTGCCCACAAACAGGCG 
TCCCTTTCCCTCTGGATAACAACAAAAGCAAGCCGGGAGGCTGGCTGCCTCTCCTCCTGC 
TGTCTCTGCTGGTGGCCACATGGGTGCTGGTGGCAGGGATCTATCTAATGTGGAGGCACG 
AAAGGATCAAGAAGACTTCCTTTTCTACCACCACACTACTGCCCCCCATTAAGGTTCTTG 
TGGTTTACCCATCTGAAATATGTTTCCATCACACAATTTGTTACTTCACTGAATTTCTTC 
AAAACCATTGCAGAAGTGAGGTCATCCTTGAAAAGTGGCAGAAAAAGAAAATAGCAGAGA 
TGGGTCCAGTGCAGTGGCTTGCCACTCAAAAGAAGGCAGCAGACAAAGTCGTCTTCCTTC 
TTTCCAATGACGTCAACAGTGTGTGCGATGGTACCTGTGGCAAGAGCGAGGGCAGTCCCA 
GTGAGAACTCTCAAGACCTCTTCCCCCTTGCCTTTAACCTTTTCTGCAGTGATCTAAGAA 
GCCAGATTCATCTGCACAAATACGTGGTGGTCTACTTTAGAGAGATTGATACAAAAGACG 
ATTACAATGCTCTCAGTGTCTGCCCCAAGTACCACTTCATGAAGGATGCCACTGCTTTCT 
GTGCAGAACTTCTCCATGTCAAGCAGCAGGTGTCAGCAGGAAAAAGATCACAAGCCTGCC 
ACGATGGCTGCTGCTCCTTGTAGCCCACCCATGAGAAGCAAGAGACCTTAAAGGCTTCCT 
ATCCCACCAATTACAGGGAAAAAACGTGTGATGATCCTGAAGCTTACTATGCAGCCTACA 
AACAGCCTTAGTAATTAAAACATTTTATACCAATAAAATTTTCAAATATTACTAACTAAT 
GTAGCATTAACTAACGATTGGAAACTACATTTACAACTTCAAAGCTGTTTTATACATAGA 
AATCAATTACAGCTTTAATTGAAAACTGTAACCATTTTGATAATGCAACAATAAAGCATC 
TTCCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 


In another set of preferred embodiments of the invention, any sequence, or unique portion 
thereof, of the CACNA1D sequences of the I.M.A.G.E. Consortium cluster NM_000720, as well as 
the UniGene Homo sapiens cluster Hs.399966, may be used. Similarly, any sequence encoding all 
or a part of the protein encoded by any CACNA1D sequence disclosed herein may be used. The 
consensus sequence of the I.M.A.G.E. Consortium cluster is as follows, with the assigned coding 
region (ending with a termination codon) underlined and preceded by the 5' untranslated and/or 
non-coding region and followed by the 3' untranslated and/or non-coding region: 

SEQ ID NO:4 (consensus sequence for CACNA1D, identified as NM_000720 or NM_000720.1) 

agaataaggg cagggaccgc ggctcctatc tcttggtgat ccccttcccc attccgcccc 
cgcctcaacg cccagcacag tgccctgcac acagtagtcg ctcaataaat gttcgtggat 
gatgatgatg atgatgatga aaaaaatgca gcatcaacgg cagcagcaag cggaccacgc 
gaacgaggca aactatgcaa gaggcaccag acttcctctt tctggtgaag gaccaacttc 
tcagccgaat agctccaagc aaactgtcct gtcttggcaa gctgcaatcg atgctgctag 
acaggccaag gctgcccaaa ctatgagcac ctctgcaccc ccacctgtag gatctctctc 
ccaaagaaaa cgtcagcaat acgccaagag caaaaaacag ggtaactcgt ccaacagccg 
acctgcccgc gcccttttct gtttatcact caataacccc atccgaagag cctgcattag 
tatagtggaa tggaaaccat ttgacatatt tatattattg gctatttttg ccaattgtgt 
ggccttagct atttacatcc cattccctga agatgattct aattcaacaa atcataactt 
ggaaaaagta gaatatgcct tcctgattat ttttacagtc gagacatttt tgaagattat 
agcgtatgga ttattgctac atcctaatgc ttatgttagg aatggatgga atttactgga 
ttttgttata gtaatagtag gattgtttag tgtaattttg gaacaattaa ccaaagaaac 
agaaggcggg aaccactcaa gcggcaaatc tggaggcttt gatgtcaaag ccctccgtgc 
ctttcgagtg ttgcgaccac ttcgactagt gtcaggggtg cccagtttac aagttgtcct 
gaactccatt ataaaagcca tggttcccct ccttcacata gcccttttgg tattatttgt 
aatcataatc tatgctatta taggattgga actttttatt ggaaaaatgc acaaaacatg 
tttttttgct gactcagata tcgtagctga agaggaccca gctccatgtg cgttctcagg 
gaatggacgc cagtgtactg ccaatggcac ggaatgtagg agtggctggg ttggcccgaa 
cggaggcatc accaactttg ataactttgc ctttgccatg cttactgtgt ttcagtgcat 
caccatggag ggctggacag acgtgctcta ctgggtaaat gatgcgatag gatgggaatg 
gccatgggtg tattttgtta gtctgatcat ccttggctca tttttcgtcc ttaacctggt 
tcttggtgtc cttagtggag aattctcaaa ggaaagagag aaggcaaaag cacggggaga 
tttccagaag ctccgggaga agcagcagct ggaggaggat ctaaagggct acttggattg 
gatcacccaa gctgaggaca tcgatccgga gaatgaggaa gaaggaggag aggaaggcaa 
acgaaatact agcatgccca ccagcgagac tgagtctgtg aacacagaga acgtcagcgg 
tgaaggcgag aaccgaggct gctgtggaag tctctggtgc tggtggagac ggagaggcgc 
ggccaaggcg gggccctctg ggtgtcggcg gtggggtcaa gccatctcaa aatccaaact 
cagccgacgc tggcgtcgct ggaaccgatt caatcgcaga agatgtaggg ccgccgtgaa 
gtctgtcacg ttttactggc tggttatcgt cctggtgttt ctgaacacct taaccatttc 
ctctgagcac tacaatcagc cagattggtt gacacagatt caagatattg ccaacaaagt 
cctcttggct ctgttcacct gcgagatgct ggtaaaaatg tacagcttgg gcctccaagc 
atatttcgtc tctcttttca accggtttga ttgcttcgtg gtgtgtggtg gaatcactga 
gacgatcctg gtggaactgg aaatcatgtc tcccctgggg atctctgtgt ttcggtgtgt 
gcgcctctta agaatcttca aagtgaccag gcactggact tccctgagca acttagtggc 
atccttatta aactccatga agtccatcgc ttcgctgttg cttctgcttt ttctcttcat 
tatcatcttt tccttgcttg ggatgcagct gtttggcggc aagtttaatt ttgatgaaac 
gcaaaccaag cggagcacct ttgacaattt ccctcaagca cttctcacag tgttccagat 
cctgacaggc gaagactgga atgctgtgat gtacgatggc atcatggctt acgggggccc 
atcctcttca ggaatgatcg tctgcatcta cttcatcatc ctcttcattt gtggtaacta 
tattctactg aatgtcttct tggccatcgc tgtagacaat ttggctgatg ctgaaagtct 
gaacactgct cagaaagaag aagcggaaga aaaggagagg aaaaagattg ccagaaaaga 
gagcctagaa aataaaaaga acaacaaacc agaagtcaac cagatagcca acagtgacaa 
caaggttaca attgatgact atagagaaga ggatgaagac aaggacccct atccgccttg 
cgatgtgcca gtaggggaag aggaagagga agaggaggag gatgaacctg aggttcctgc 
cggaccccgt cctcgaagga tctcggagtt gaacatgaag gaaaaaattg cccccatccc 
tgaagggagc gctttcttca ttcttagcaa gaccaacccg atccgcgtag gctgccacaa 
gctcatcaac caccacatct tcaccaacct catccttgtc ttcatcatgc tgagcagcgc 
tgccctggcc gcagaggacc ccatccgcag ccactccttc cggaacacga tactgggtta 
ctttgactat gccttcacag ccatctttac tgttgagatc ctgttgaaga tgacaacttt 
tggagctttc ctccacaaag gggccttctg caggaactac ttcaatttgc tggatatgct 
ggtggttggg gtgtctctgg tgtcatttgg gattcaatcc agtgccatct ccgttgtgaa 
gattctgagg gtcttaaggg tcctgcgtcc cctcagggcc atcaacagag caaaaggact 
taagcacgtg gtccagtgcg tcttcgtggc catccggacc atcggcaaca tcatgatcgt 
cactaccctc ctgcagttca tgtttgcctg tatcggggtc cagttgttca aggggaagtt 
ctatcgctgt acggatgaag ccaaaagtaa ccctgaagaa tgcaggggac ttttcatcct 
ctacaaggat ggggatgttg acagtcctgt ggtccgtgaa cggatctggc aaaacagtga 
tttcaacttc gacaacgtcc tctctgctat gatggcgctc ttcacagtct ccacgtttga 
gggctggcct gcgttgctgt ataaagccat cgactcgaat ggagagaaca tcggcccaat 
ctacaaccac cgcgtggaga tctccatctt cttcatcatc tacatcatca ttgtagcttt 
cttcatgatg aacatctttg tgggctttgt catcgttaca tttcaggaac aaggagaaaa 
agagtataag aactgtgagc tggacaaaaa tcagcgtcag tgtgttgaat acgccttgaa 
agcacgtccc ttgcggagat acatccccaa aaacccctac cagtacaagt tctggtacgt 
ggtgaactct tcgcctttcg aatacatgat gtttgtcctc atcatgctca acacactctg 
cttggccatg cagcactacg agcagtccaa gatgttcaat gatgccatgg acattctgaa 
catggtcttc accggggtgt tcaccgtcga gatggttttg aaagtcatcg catttaagcc 
taaggggtat tttagtgacg cctggaacac gtttgactcc ctcatcgtaa tcggcagcat 
tatagacgtg gccctcagcg aagcggaccc aactgaaagt gaaaatgtcc ctgtcccaac 
tgctacacct gggaactctg aagagagcaa tagaatctcc atcacctttt tccgtctttt 
ccgagtgatg cgattggtga agcttctcag caggggggaa ggcatccgga cattgctgtg 
gacttttatt aagtcctttc aggcgctccc gtatgtggcc ctcctcatag ccatgctgtt 
cttcatctat gcggtcattg gcatgcagat gtttgggaaa gttgccatga gagataacaa 
ccagatcaat aggaacaata acttccagac gtttccccag gcggtgctgc tgctcttcag 
gtgtgcaaca ggtgaggcct ggcaggagat catgctggcc tgtctcccag ggaagctctg 
tgaccctgag tcagattaca accccgggga ggagtataca tgtgggagca actttgccat 
tgtctatttc atcagttttt acatgctctg tgcatttctg atcatcaatc tgtttgtggc 
tgtcatcatg gataatttcg actatctgac ccgggactgg tctattttgg ggcctcacca 
tttagatgaa ttcaaaagaa tatggtcaga atatgaccct gaggcaaagg gaaggataaa 
acaccttgat gtggtcactc tgcttcgacg cabccagcct cccctggggt ttgggaagtt 
atgtccacac agggtagcgt gcaagagatt agttgccatg aacatgcctc tcaacagtga 
cgggacagtc atgtttaatg caaccctgtt tgctttggtt cgaacggctc ttaagatcaa 
gaccgaaggg aacctggagc aagctaatga agaacttcgg gctgtgataa agaaaatttg 
gaagaaaacc agcatgaaat tacttgacca agttgtccct ccagctggtg atgatgaggt 
aaccgtgggg aagttctatg ccactttcct gatacaggac tactttagga aattcaagaa 
acggaaagaa caaggactgg tgggaaagta ccctgcgaag aacaccacaa ttgccctaca 
ggcgggatta aggacactgc atgacattgg gccagaaatc cggcgtgcta tatcgtgtga 
tttgcaagat gacgagcctg aggaaacaaa acgagaagaa gaagatgatg tgttcaaaag 
aaatggtgcc ctgcttggaa accatgtcaa tcatgttaat agtgatagga gagattccct 
tcagcagacc aataccaccc accgtcccct gcatgtccaa aggccttcaa ttccacctgc 
aagtgatact gagaaaccgc tgtttcctcc agcaggaaat tcggtgtgtc ataaccatca 
taaccataat tccataggaa agcaagttcc cacctcaaca aatgccaatc tcaataatgc 
caatatgtcc aaagctgccc atggaaagcg gcccagcatt gggaaccttg agcatgtgtc 
tgaaaatggg catcattctt cccacaagca tgaccgggacf cctcagagaa ggtccagtgt 
gaaaagaacc cgctattatg aaacttacat taggtccgac tcaggagatg aacagctccc 
aactatttgc cgggaagacc cagagataca tggctatttc agggaccccc actgcttggg 
ggagcaggag tatttcagta gtgaggaatg ctacgaggat gacagctcgc ccacctggag 
caggcaaaac tatggctact acagcagata cccaggcaga aacatcgact ctgagaggcc 
ccgaggctac catcatcccc aaggattctt ggaggacgat gactcgcccg tttgctatga 
ttcacggaga tctccaagga gacgcctact acctcccacc ccagcatccc accggagatc 
ctccttcaac tttgagtgcc tgcgccggca gagcagccag gaagaggtcc cgtcgtctcc 
catcttcccc catcgcacgg ccctgcctct gcatctaatg cagcaacaga tcatggcagt 
tgccggccta gattcaagta aagcccagaa gtactcaccg agtcactcga cccggtcgtg 
ggccacccct ccagcaaccc ctccctaccg ggactggaca ccgtgctaca cccccctgat 
ccaagtggag cagtcagagg ccctggacca ggtgaacggc agcctgccgt ccctgcaccg 
cagctcctgg tacacagacg agcccgacat ctcctaccgg actttcacac cagccagcct 
gactgtcccc agcagcttcc ggaacaaaaa cagcgacaag cagaggagtg cggacagctt 
ggtggaggca gtcctgatat ccgaaggctt gggacgctat gcaagggacc caaaatttgt 
gtcagcaaca 


aaacacgaaa 


tcgctgatgc 


ctgtgacctc 


accatcgacg 


agatggagag 


tgcagccagc 


accctgctta 


atgggaacgt 


gcgtccccga 


gccaacgggg 


atgtgggccc 


cctctcacac 


cggcaggact 


atgagctaca 


ggactttggt 


cctggctaca gcgacgaaga 


gccagaccct 


gggagggatg 


aggaggacct ggcggatgaa atgatatgca 


tcaccacctt 



gtagccccca gcgaggggca gactggctct ggcctcaggt ggggcgcagg agagccaggg
gaaaagtgcc tcatagttag gaaagtttag gcactagttg ggagtaatat tcaattaatt
agacttttgt ataagagatg tcatgcctca agaaagccat aaacctggta ggaacaggtc
ccaagcggtt gagcctggca gagtaccatg cgctcggccc cagctgcagg aaacagcagg
ccccgccctc tcacagagga tgggtgagga ggccagacct gccctgcccc attgtccaga
tgggcactgc tgtggagtct gcttctccca tgtaccaggg caccaggccc acccaactga
aggcatggcg gcggggtgca ggggaaagtt aaaggtgatg acgatcatca cacctcgtgt
cgttacctca gccatcggtc tagcatatca gtcactgggc ccaacatatc catttttaaa
ccctttcccc caaatacact gcgtcctggt tcctgtttag ctgttctgaa ata

I.M.A.G.E. Consortium Clone ID numbers and the corresponding GenBank accession 
numbers of sequences identified as belonging to the I.M.A.G.E. Consortium and UniGene clusters 
are listed below. Also included are sequences that are not identified as having a Clone ID number 
but are still identified as being those of CACNA1D. The sequences include those of the "sense" and 
complementary strands corresponding to CACNA1D. The sequence of each GenBank 
accession number is presented in the attached Appendix. 



Clone ID numbers            GenBank accession numbers

5676430                     BM128550
5197948                     BI755471
6027638                     BQ549084, BQ549571
2338956                     AI693324
36581                       R25307, R46658
49630                       H29256, H29339
4798765                     BG716371
2187310                     AI537488
838231                      AA458692
2111614                     AI393327
2183482                     AI520947
1851007                     AI248998
1675503                     AI075844
2434923                     AI869807
2434924                     AI869800
1845827                     AI243110
2511756                     AI955764
628568                      AA192669, AA192157
2019331                     AI361691
2337381                     AI914244
2503579                     AW008769
2503626                     AW008794
1160989                     AA877582
1653475                     AI051972
1627755                     AI017959
287750                      N79331, N62240
1867677                     AI240933
1618303                     AI015031
1881344                     AI290994
1408031                     AA861160
1557035                     AA915941
956303                      AA493341
2148234                     AI467998
1499899                     AA885585
1647592                     AI033648
2341185                     AI697633
981603                      AA523647
6281678                     BQ710377
6278348                     BQ706920
5876024                     BQ016847
6608849                     CA943595
5440464                     BM008196
5209489                     BI769856
5183025                     BI758971
880540                      AA468565
757337                      AA437099
6608849                     CA867864
461797                      AA682690
434787                      AA701888
6151588                     BU182632
6295618                     BQ898429
6300779                     BQ711800
434811                      AA703120
1568025                     AA978315
3220210                     BE550599
3214121                     BE502741
3009312                     AW872382
2733394                     AW444663
2872156                     AW341279
30514550                    CF456750
2718456                     AW139850
2543682                     AW029633
2492730                     AI963788
2545866                     AI951788
2272081                     AI680744
2152336                     AI601252
2146429                     AI459166
1274498                     AA885750
2272081                     BX092736
287750                      BX114568
3233645                     BE672659
289209                      N78509, N73668
277086                      N46744, N39597
3272340                     BF439267
3273859                     BF436153
3568401                     BF110611
None (mRNA sequences)       M76558, AF088004, M83566
None                        CB410657, BQ372430, BQ366601, BQ324528, 
                            BQ318830, AL708030, BM509161, N85902, 
                            BQ774355, CA774243, CA436347, CA389011, 
                            BU679327, BU608029, BU073743, BE175413, 
                            AW969248, AI908115, BF754485, BI015409, 
                            BG202552, BF883669, BF817590, BF807128, 
                            BF806160, BF805244, BF805235, BF805080, 
                            T27949, BE836638, BE770685, BE769065, 



In one preferred embodiment, any sequence, or unique portion thereof, of the following 
CACNA1D sequence, identified by AF088004 or AF088004.1, may be used in the practice of the 
invention. 

SEQ ID NO:5 (sequence for CACNA1D): 

TTTTTTTTTTTTTTTTTTTTTCTTACAAAGAAAAATTTAATATTCGATGAGAGGTTGAAC 
CAGGCTTAAAGCAGACATACTAGGAAATGGTGCAGCCTGTAAGAATGCCAGTTTGTAAGT 
ACTGACTTTGGAAAAGATCATCGCCTCTATCAGACACTTAGGGTCCTGGTCTGGCAATTT 
TGGCCTGATGTGATGCCACAAGACCCAACAGAGAGAGACACAGAGTCCAGGATAATGTTG 
ACAGTGGTGTAGCCCTTTAGGAGAAATGGCGCTCCCTGCGGCTGGTATTAGGTTACCATT 
GGCACCGAAGGAACCAGGAGGATAAGAATATCCATAATTTCAGAGCTGCCCTGGCACAGT 
ACCTGCCCCGTCGGAGGCTCTCACTGGCAAATGACAGCTCTGTGCAAGGAGCACTCCCAA 
GTATAAAAATTATTACACAGTTTTATTCTGAAGAACATTTTGCATTTTAATAAAAAAGGA 
TTTATGTCAGGAAAGAGTCATTTACAAACCTTGAAGTGTTTTTGCCTGGATCAGAGTAAG 
AATGTCTTAAGAAGAGGTTTGTAAGGTCTTCATAACAAAGTGGTGTTTGTTATTTACAAA 
AAAAAAAAAAAAAAAAATTAACAGGTTGTCTGTATACTATTAAAAATTTTGGACCAAAAA 
AAAAAAAAAAAAAAA 

In another set of preferred embodiments of the invention, any sequence, or unique portion 
thereof, of the HOXB13 sequences of the I.M.A.G.E. Consortium cluster NM_006361, as well as 
the UniGene Homo sapiens cluster Hs.66731, may be used. Similarly, any sequence encoding all or 
a part of the protein encoded by any HOXB13 sequence disclosed herein may be used. The 
consensus sequence of the I.M.A.G.E. Consortium cluster is as follows, with the assigned coding 
region (ending with a termination codon) underlined and preceded by the 5' untranslated and/or 
non-coding region and followed by the 3' untranslated and/or non-coding region: 



SEQ ID NO:6 (consensus sequence for HOXB13, identified as NM_006361 or NM_006361.2): 



cgaatgcagg cgacttgcga gctgggagcg atttaaaacg ctttggattc ccccggcctg 
ggtggggaga gcgagctggg tgccccctag attccccgcc cccgcacctc atgagccgac 
cctcggctcc atggagcccg gcaattatgc caccttggat ggagccaagg atatcgaagg 

agcggcgcct acgctgatgc ctgctgtcaa ctatgccccc ttggatctgc caggctcggc 
ggagccgcca aagcaatgcc acccatgccc tggggtgccc caggggacgt ccccagctcc 
cgtgccttat ggttactttg gaggcgggta ctactcctgc cgagtgtccc ggagctcgct 
gaaaccctgt gcccaggcag ccaccctggc cgcgtacccc gcggagactc ccacggccgg 
ggaagagtac cccagtcgcc ccactgagtt tgccttctat ccgggatatc cgggaaccta 
ccacgctatg gccagttacc tggacgtgtc tgtggtgcag actctgggtg ctcctggaga 
accgcgacat gactccctgt tgcctgtgga cagttaccag tcttgggctc tcgctggtgg 
ctggaacagc cagatgtgtt gccagggaga acagaaccca ccaggtccct tttggaaggc 
agcatttgca gactccagcg ggcagcaccc tcctgacgcc tgcgcctttc gtcgcggccg 
caagaaacgc attccgtaca gcaaggggca gttgcgggag ctggagcggg agtatgcggc 
taacaagttc atcaccaagg acaagaggcg caagatctcg gcagccacca gcctctcgga 
gcgccagatt accatctggt ttcagaaccg ccgggtcaaa gagaagaagg ttctcgccaa 
ggtgaagaac agcgctaccc cttaagagat ctccttgcct gggtgggagg agcgaaagtg 
ggggtgtcct ggggagacca gaaacctgcc aagcccaggc tggggccaag gactctgctg 
agaggcccct agagacaaca cccttcccag gccactggct gctggactgt tcctcaggag 
cggcctgggt acccagtatg tgcagggaga cggaacccca tgtgacaggc ccactccacc 
agggttccca aagaacctgg cccagtcata atcattcatc ctcacagtgg caataatcac 
gataaccagt 

I.M.A.G.E. Consortium Clone ID numbers and the corresponding GenBank accession 
numbers of sequences identified as belonging to the I.M.A.G.E. Consortium and UniGene clusters 
are listed below. Also included are sequences that are not identified as having a Clone ID number 
but are still identified as being those of HOXB13. The sequences include those of the "sense" and 
complementary strands corresponding to HOXB13. The sequence of each GenBank 
accession number is presented in the attached Appendix. 



Clone ID numbers            GenBank accession numbers

4250486                     BF676461, BC007092
5518335                     BM462617
4874541                     BG752489
4806039                     BG778198
3272315                     CB050884, CB050885
4356740                     BF965191
6668163                     BU930208
1218366                     AA807966
2437746                     AI884491
1187697                     AA652388
3647557                     BF446158
1207949                     AA657924
1047774                     AA644637
3649397                     BF222357
971664                      AA527613
996191                      AA533227
813481                      AA456069, AA455572, BX117624
6256333                     BQ673782
2408470                     AI814453
2114743                     AI417272
998548                      AA535663
2116027                     AI400493
3040843                     AW779219
1101311                     AA594847
1752062                     AI150430
898712                      AA494387
1218874                     AA662643
2460189                     AI935940
986283                      AA532530
1435135                     AA857572
1871750                     AI261980
3915135                     BE888751
2069668                     AI378797
667188                      AA234220, AA236353
1101561                     AA588193
1170268                     AI821103, AI821851, AA635855
2095067                     AI420753
4432770                     BG180547
783296                      AA468306, AA468232
3271646                     CB050115, CB050116
1219276                     AA661819
30570598                    CF146837
30570517                    CF146763
30568921                    CF144902
3099071                     CF141511
3096992                     CF139563
3096870                     CF139372
3096623                     CF139319
3096798                     CF139275
30572408                    CF122893
2490082                     AI972423
2251055                     AI918975
2419308                     AI826991
2249105                     AI686312
2243362                     AI655923
30570697                    CF146922
3255712                     BF476369
3478356                     BF057410
3287977                     BE645544
3287746                     BE645408
3621499                     BE388501
30571128                    CF147366
30570954                    CF147143
None (mRNA sequences)       BT007410, BC007092, U57052, U81599
None                        CB120119, CB125764, AU098628, CB126130, 
                            BI023924, BM767063, BM794275, BQ363211, 
                            BM932052, AA357646, AW609525, CB126919, 
                            AW609336, AW609244, BF855145, AU126914, 
                            CB126449, AW582404, BX641644 



In one preferred embodiment, any sequence, or unique portion thereof, of the following 
HOXB13 sequence, identified by BC007092 or BC007092.1, may be used in the practice of the 
invention. 



SEQ ID NO:7 (sequence for HOXB13): 

GGATTCCCCCGGCCTGGGTGGGGAGAGCGAGCTGGGTGCCCCCTAGATTCCCCGCCCCCG 
CACCTCATGAGCCGACCCTCGGCTCCATGGAGCCCGGCAATTATGCCACCTTGGATGGAG 
CCAAGGATATCGAAGGCTTGCTGGGAGCGGGAGGGGGGCGGAATCTGGTCGCCCACTCCC 
CTCTGACCAGCCACCCAGCGGCGCCTACGCTGATGCCTGCTGTCAACTATGCCCCCTTGG 
ATCTGCCAGGCTCGGCGGAGCCGCCAAAGCAATGCCACCCATGCCCTGGGGTGCCCCAGG 
GGACGTCCCCAGCTCCCGTGCCTTATGGTTACTTTGGAGGCGGGTACTACTCCTGCCGAG 
TGTCCCGGAGCTCGCTGAAACCCTGTGCCCAGGCAGCCACCCTGGCCGCGTACCCCGCGG 
AGACTCCCACGGCCGGGGAAGAGTACCCCAGCCGCCCCACTGAGTTTGCCTTCTATCCGG 
GATATCCGGGAACCTACCAGCCTATGGCCAGTTACCTGGACGTGTCTGTGGTGCAGACTC 
TGGGTGCTCCTGGAGAACCGCGACATGACTCCCTGTTGCCTGTGGACAGTTACCAGTCTT 
GGGCTCTCGCTGGTGGCTGGAACAGCCAGATGTGTTGCCAGGGAGAACAGAACCCACCAG 
GTCCCTTTTGGAAGGCAGCATTTGCAGACTCCAGCGGGCAGCACCCTCCTGACGCCTGCG 
CCTTTCGTCGCGGCCGCAAGAAACGCATTCCGTACAGCAAGGGGCAGTTGCGGGAGCTGG 
AGCGGGAGTATGCGGCTAACAAGTTCATCACCAAGGACAAGAGGCGCAAGATCTCGGCAG 
CCACCAGCCTCTCGGAGCGCCAGATTACCATCTGGTTTCAGAACCGCCGGGTCAAAGAGA 
AGAAGGTTCTCGCCAAGGTGAAGAACAGCGCTACCCCTTAAGAGATCTCCTTGCCTGGGT 
GGGAGGAGCGAAAGTGGGGGTGTCCTGGGGAGACCAGGAACCTGCCAAGCCCAGGCTGGG 
GCCAAGGACTCTGCTGAGAGGCCCCTAGAGACAACACCCTTCCCAGGCCACTGGCTGCTG 
GACTGTTCCTCAGGAGCGGCCTGGGTACCCAGTATGTGCAGGGAGACGGAACCCCATGTG 
ACAGCCCACTCCACCAGGGTTCCCAAAGAACCTGGCCCAGTCATAATCATTCATCCTGAC 
AGTGGCAATAATCACGATAACCAGTACTAGCTGCCATGATCGTTAGCCTCATATTTTCTA 
TCTAGAGCTCTGTAGAGCACTTTAGAAACCGCTTTCATGAATTGAGCTAATTATGAATAA 
ATTTGGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

Sequences identified by SEQ ID NO. are provided using conventional representations of a 
DNA strand starting from the 5' phosphate linked end to the 3' hydroxyl linked end. The 
assignment of coding regions is generally by comparison to available consensus sequence(s) and 
therefore may contain inconsistencies relative to other sequences assigned to the same cluster. 
These have no effect on the practice of the invention because the invention can be practiced by use 
of shorter segments (or combinations thereof) of sequences unique to each of the three sets 
described above and not affected by inconsistencies. As non-limiting examples, a segment of 
IL17BR, CACNA1D, or HOXB13 nucleic acid sequence composed of a 3' untranslated region 
sequence and/or a sequence from the 3' end of the coding region may be used as a probe for the 
detection of IL17BR, CACNA1D, or HOXB13 expression, respectively, without being affected by 
the presence of any inconsistency in the coding regions due to differences between sequences. 
Similarly, the use of an antibody which specifically recognizes IL17BR, CACNA1D, or HOXB13 
protein to detect its expression would not be affected by the presence of any inconsistency in the 
representation of the coding regions provided above. 

As will be appreciated by those skilled in the art, some of the above sequences include 3' 
poly A (or poly T on the complementary strand) stretches that do not contribute to the uniqueness of 
the disclosed sequences. The invention may thus be practiced with sequences lacking the 3' poly A 
(or poly T) stretches. The uniqueness of the disclosed sequences refers to the portions or entireties 
of the sequences which are found only in IL17BR, CACNA1D, or HOXB13 nucleic acids, 
including unique sequences found at the 3' untranslated portion of the genes. Preferred unique 
sequences for the practice of the invention are those which contribute to the consensus sequences 
for each of the three sets such that the unique sequences will be useful in detecting expression in a 
variety of individuals rather than being specific for a polymorphism present in some individuals. 
Alternatively, sequences unique to an individual or a subpopulation may be used. The preferred 
unique sequences are preferably of the lengths of polynucleotides of the invention as discussed 
herein. 
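
As a non-limiting illustration of practicing the invention with sequences lacking the 3' poly A (or poly T) stretches, the following Python sketch removes such stretches from a sequence string. The function name `strip_poly_tails` and the minimum run length of 10 are assumptions made for illustration only and are not taken from the disclosure.

```python
import re


def strip_poly_tails(seq: str, min_run: int = 10) -> str:
    """Remove a leading poly-T run and a trailing poly-A run.

    min_run is a hypothetical cutoff: runs shorter than this are kept,
    since short A/T runs may be part of the unique sequence.
    """
    s = seq.upper()
    s = re.sub(r"^T{%d,}" % min_run, "", s)   # poly T at the 5' end (complementary strand)
    s = re.sub(r"A{%d,}$" % min_run, "", s)   # poly A at the 3' end
    return s


if __name__ == "__main__":
    # Illustrative input only: a short unique segment followed by a poly A stretch.
    example = "ATTTGG" + "A" * 30
    print(strip_poly_tails(example))
```

The trimmed sequence, rather than the full-length entry, would then be the input to any search for unique portions.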

To determine the (increased or decreased) expression levels of the above described 
sequences in the practice of the present invention, any method known in the art may be utilized. In 
one preferred embodiment of the invention, expression based on detection of RNA which 
hybridizes to polynucleotides containing the above described sequences is used. This is readily 
performed by any RNA detection or amplification+detection method known or recognized as 
equivalent in the art such as, but not limited to, reverse transcription-PCR (optionally real-time 
PCR), the methods disclosed in U.S. Patent Application 10/062,857 entitled "Nucleic Acid 
Amplification" filed on October 25, 2001 as well as U.S. Provisional Patent Applications 
60/298,847 (filed June 15, 2001) and 60/257,801 (filed December 22, 2000), the methods disclosed 
in U.S. Patent 6,291,170, and quantitative PCR. Methods to identify increased RNA stability 
(resulting in an observation of increased expression) or decreased RNA stability (resulting in an 
observation of decreased expression) may also be used. These methods include the detection of 
sequences that increase or decrease the stability of mRNAs containing the IL17BR, CACNA1D, or 
HOXB13 sequences disclosed herein. These methods also include the detection of increased 
mRNA degradation. 

In particularly preferred embodiments of the invention, polynucleotides having sequences 
present in the 3' untranslated and/or non-coding regions of the above disclosed sequences are used 
to detect expression or non-expression of IL17BR, CACNA1D, or HOXB13 sequences in breast 
cells in the practice of the invention. Such polynucleotides may optionally contain sequences found 
in the 3' portions of the coding regions of the above disclosed sequences. Polynucleotides 
containing a combination of sequences from the coding and 3' non-coding regions preferably have 
the sequences arranged contiguously, with no intervening heterologous sequence(s). 

Alternatively, the invention may be practiced with polynucleotides having sequences present 
in the 5' untranslated and/or non-coding regions of IL17BR, CACNA1D, or HOXB13 sequences in 
breast cells to detect their levels of expression. Such polynucleotides may optionally contain 
sequences found in the 5' portions of the coding regions. Polynucleotides containing a combination 
of sequences from the coding and 5' non-coding regions preferably have the sequences arranged 
contiguously, with no intervening heterologous sequence(s). The invention may also be practiced 
with sequences present in the coding regions of IL17BR, CACNA1D, or HOXB13. 

Preferred polynucleotides contain sequences from 3' or 5' untranslated and/or non-coding 
regions of at least about 16, at least about 18, at least about 20, at least about 22, at least about 24, at 
least about 26, at least about 28, at least about 30, at least about 32, at least about 34, at least about 
36, at least about 38, at least about 40, at least about 42, at least about 44, or at least about 46 
consecutive nucleotides. The term "about" as used in the previous sentence refers to an increase or 
decrease of 1 from the stated numerical value. Even more preferred are polynucleotides containing 
sequences of at least or about 50, at least or about 100, at least or about 150, at least or about 200, at 
least or about 250, at least or about 300, at least or about 350, or at least or about 400 consecutive 
nucleotides. The term "about" as used in the preceding sentence refers to an increase or decrease of 
10% from the stated numerical value. 
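
The two meanings of "about" given above can be illustrated, without limitation, by the following Python sketch. The function name `meets_at_least_about` is hypothetical, and reading "at least about N" as "no shorter than the lower edge of the stated tolerance" is an interpretive assumption rather than a definition from the disclosure.

```python
def meets_at_least_about(length: int, stated: int) -> bool:
    """Check a polynucleotide length against an 'at least about <stated>' value.

    For stated values below 50 (the 16 to 46 series) 'about' is +/-1 nucleotide;
    for stated values of 50 and above (the 50 to 400 series) it is +/-10%.
    Integer arithmetic is used to avoid floating point rounding.
    """
    if stated < 50:
        return length >= stated - 1
    return 10 * length >= 9 * stated


if __name__ == "__main__":
    for length, stated in [(15, 16), (14, 16), (45, 50), (89, 100)]:
        print(length, stated, meets_at_least_about(length, stated))
```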

Sequences from the 3' or 5' end of the above described coding regions as found in 
polynucleotides of the invention are of the same lengths as those described above, except that they 
would naturally be limited by the length of the coding region. The 3' end of a coding region may 
include sequences up to the 3' half of the coding region. Conversely, the 5' end of a coding region 
may include sequences up to the 5' half of the coding region. Of course, the above described 
sequences, or the coding regions and polynucleotides containing portions thereof, may be used in 
their entireties. 

Polynucleotides combining the sequences from a 3' untranslated and/or non-coding region 
and the associated 3' end of the coding region are preferably at least or about 100, at least or about 
150, at least or about 200, at least or about 250, at least or about 300, at least or about 350, or at 
least or about 400 consecutive nucleotides. Preferably, the polynucleotides used are from the 3' end 
of the gene, such as within about 350, about 300, about 250, about 200, about 150, about 100, or 
about 50 nucleotides from the polyadenylation signal or polyadenylation site of a gene or expressed 
sequence. Polynucleotides containing mutations relative to the sequences of the disclosed genes 
may also be used so long as the presence of the mutations still allows hybridization to produce a 
detectable signal. 
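
By way of a non-limiting sketch, a segment lying within a given number of nucleotides of the polyadenylation signal may be selected as shown below in Python. The function name `three_prime_window` is hypothetical. The use of the canonical hexamer AATAAA as the signal, and the choice of the last occurrence when several are present, are assumptions for illustration; the disclosure does not specify how the signal is located.

```python
def three_prime_window(seq: str, window: int = 350, signal: str = "AATAAA") -> str:
    """Return up to `window` nucleotides immediately 5' of the polyadenylation signal.

    The last occurrence of `signal` is used. If no signal is found, the window
    is measured from the 3' end of the sequence instead (an assumed fallback).
    """
    s = seq.upper()
    pos = s.rfind(signal)
    end = pos if pos != -1 else len(s)
    return s[max(0, end - window):end]


if __name__ == "__main__":
    # Synthetic input for illustration only.
    demo = "C" * 400 + "AATAAA" + "GTGT" + "A" * 20
    print(len(three_prime_window(demo, window=100)))
```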

In another embodiment of the invention, polynucleotides containing deletions of nucleotides 
from the 5' and/or 3' end of the above disclosed sequences may be used. The deletions are 
preferably of 1-5, 5-10, 10-15, 15-20, 20-25, 25-30, 30-35, 35-40, 40-45, 45-50, 50-60, 60-70, 
70-80, 80-90, 90-100, 100-125, 125-150, 150-175, or 175-200 nucleotides from the 5' and/or 3' end, 
although the extent of the deletions would naturally be limited by the length of the disclosed 
sequences and the need to be able to use the polynucleotides for the detection of expression levels. 
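
The terminal deletions described above, and the limit that the remaining polynucleotide stay usable, may be sketched in Python as follows. The function name `delete_from_ends` is hypothetical, and the default floor of 16 nucleotides is an assumption borrowed from the shortest preferred length recited elsewhere herein, not a stated requirement for deletions.

```python
def delete_from_ends(seq: str, five_prime: int = 0, three_prime: int = 0,
                     min_length: int = 16) -> str:
    """Delete nucleotides from the 5' and/or 3' end of a sequence.

    Raises ValueError if a deletion size is negative or if the remaining
    sequence would be shorter than `min_length` (an assumed usability floor).
    """
    if five_prime < 0 or three_prime < 0:
        raise ValueError("deletion sizes must be non-negative")
    trimmed = seq[five_prime:len(seq) - three_prime]
    if len(trimmed) < min_length:
        raise ValueError("remaining sequence is too short to use for detection")
    return trimmed


if __name__ == "__main__":
    oligo = "ACGT" * 15  # synthetic 60-mer for illustration
    print(len(delete_from_ends(oligo, five_prime=5, three_prime=10)))
```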

Other polynucleotides of the invention from the 3' end of the above disclosed sequences 
include those of primers and optional probes for quantitative PCR. Preferably, the primers and 
probes are those which amplify a region less than about 350, less than about 300, less than about 
250, less than about 200, less than about 150, less than about 100, or less than about 50 nucleotides 
from the polyadenylation signal or polyadenylation site of a gene or expressed sequence. 

In yet another embodiment of the invention, polynucleotides containing portions of the 
above disclosed sequences including the 3' end may be used in the practice of the invention. Such 
polynucleotides would contain at least or about 50, at least or about 100, at least or about 150, at 
least or about 200, at least or about 250, at least or about 300, at least or about 350, or at least or 
about 400 consecutive nucleotides from the 3' end of the disclosed sequences. 

The invention thus also includes polynucleotides used to detect IL17BR, CACNA1D, or 
HOXB13 expression in breast cells. The polynucleotides may comprise a shorter polynucleotide 
consisting of sequences found in the above provided SEQ ID NOS in combination with 
heterologous sequences not naturally found in combination with IL17BR, CACNA1D, or HOXB13 
sequences. 

As non-limiting examples, a polynucleotide comprising one of the following sequences may 
be used in the practice of the invention. 



SEQ ID NO:8: 

CAATTACAGGGAAAAAACGTGTGATGATCCTGAAGCTTACTATGCAGCCTACAAACAGCC 

SEQ ID NO:9: 

GCTCTCACTGGCAAATGACAGCTCTGTGCAAGGAGCACTCCCAAGTATAAAAATTATTAC 

SEQ ID NO:10: 

GATCGTTAGCCTCATATTTTCTATCTAGAGCTCTGTAGAGCACTTTAGAAACCGCTTTCA 

Stated differently, the invention may be practiced with a polynucleotide consisting of the 
sequence of SEQ ID NOS:8, 9 or 10 in combination with one or more heterologous sequences that 
are not normally found with SEQ ID NOS:8, 9 or 10. Alternatively, the invention may also be 
practiced with a polynucleotide consisting of the sequence of SEQ ID NOS:8, 9 or 10 in 
combination with one or more naturally occurring sequences that are normally found with SEQ ID 
NOS:8, 9 or 10. 

Polynucleotides with sequences comprising SEQ ID NOS:8 or 9, either naturally occurring 
or synthetic, may be used to detect nucleic acids which are overexpressed in breast cancer cells that 
are responsive, and those which are not overexpressed in breast cancer cells that are 
non-responsive, to treatment with TAM or another "antiestrogen" agent against breast cancer. 
Polynucleotides with sequences comprising SEQ ID NO:10, either naturally occurring or synthetic, 
may be used to detect nucleic acids which are underexpressed in breast cancer cells that are 
responsive, and those which are not underexpressed in breast cancer cells that are non-responsive, 
to treatment with TAM or another "antiestrogen" agent against breast cancer. 

Additional sequences that may be used in polynucleotides as described above for SEQ ID 
NOS:8 and 9 are the following, wherein SEQ ID NO:33 is complementary to a portion of IL17BR 
sequences disclosed herein: 

SEQ ID NO:11: 

TGCCTAATTTCACTCTCAGAGTGAGGCAGGTAACTGGGGCTCCACTGGGTCACTCTGAGA 

SEQ ID NO:12: 

TTGGAAGCAGAGTCCCTCTAAAGGTAACTCTTGTGGTCACTCAATATTGTATTGGCATTT 

SEQ ID NO:13: 

ACGTTAGACTTTTGCTGGCATTCAAGTCATGGCTAGTCTGTGTATTTAATAAATGTGTGT 

SEQ ID NO:14: 

CTGGTCAGCCACTCTGACTTTTCTACCACATTAAATTCTCCATTACATCTCACTATTGGT 

SEQ ID NO:15: 

TACAACTTCTGAATGCTGCACATTCTTCCAAAATGATCCTTAGCACAATCTATTGTATGA 

SEQ ID NO:16: 

GGGATGGCCTTTAGGCCACAGTAGTGTCTGTGTTAAGTTCACTAAATGTGTATTTAATGA 

SEQ ID NO:17: 

CTCAAAGTGCTAAAGCTATGGTTGACTGCTCTGGTGTTTTTATATTCATTCGTGCTTTAG 

SEQ ID NO:32: 

CTGAAGCTTACTATGCAGCCTACAA 

SEQ ID NO:33: 

TCCAATCGTTAGTTAATGCTACATTAGTT 

SEQ ID NO:34: 

CAGCCTTAGTAATTAAAAC 



Additional sequences that may be used in polynucleotides as described above for SEQ ID 
NO:10 are the following, wherein SEQ ID NO:36 is complementary to a portion of HOXB13 
sequences disclosed herein: 

25 

SEQ ID NO:18: 

CTATGGGGATGGTCCACTGTCACTGTTTCTCTGCTGTTGCAAATACATGGATAACACATT 

SEQ ID NO:19: 

ACTGGAAAAGCAGATGGTCTGACTGTGCTATGGCCTCATCATCAAGACTTTCAATCCTAT 

SEQ ID NO:20: 

ACGCCAAGCTCTTCAGTGAAGACACGATGTTATTAAAAGCCTGTTTTAGGGACTGCAAAA 

SEQ ID NO:21: 

TTTTTGTAAAATCTTTAACCTTCCCTTTGTTCTTCATGTACACGCTGAACTGCAATTCTT 

SEQ ID NO:22: 

AACCTGGGGCATTTAGGGCAGAGGACAAAAGGATGTCAGCAATTGCTTGGGCTGCTTGGC 

SEQ ID NO:23: 

CTGGAACCTCTGGACTCCCCATGCTCTAACTCCCACACTCTGCTATCAGAAACTTAAACT 

SEQ ID NO:24: 

AACCCCAGAACCATCTAAGACATGGGATTCAGTGATCATGTGGTTCTCCTTTTAACTTAC 

SEQ ID NO:25: 

GGCCATGTGCCATGGTATTTGGGTCCTGGGAGGGTGGGTGAAATAAAGGCATACTGTCTT 

SEQ ID NO:26: 

GTGTAGGCAGTCATGGCACCAAAGCCACCAGACTGACAAATGTGTATCAGATGCTTTTGT 

SEQ ID NO:27: 

GAAAACCTCTTCAAAAGACAAAAAGCTGGCACTGCATTCTCTCTCTGTAGCAGGACAGAA 

SEQ ID NO:28: 

CACATCTTTAGGGTCAGTGAACAATGGGGCACATTTGGCACTAGCTTGAGCCCAACTCTG 

SEQ ID NO:29: 

GCCTTAATTTCCTCATCTGAAAACTGGAAGGCCTGACTTGACTTGTTGAGCTTAAGATCC 

SEQ ID NO:30: 

CTTCAGGGGAGGATCAAGCTTTGAACCAAAGCCAATCACTGGCTTGATTTGTGTTTTTTA 

SEQ ID NO:31: 

ACAAGTTTTCACTGAATGAGCATGGCAGTGCCACTCAAGAAAATGAATCTCCAAAGTATC 

SEQ ID NO:35: 

GCCATGATCGTTAGCCTCATATT 

SEQ ID NO:36: 

CAATTCATGAAAGCGGTTTCTAAAG 

SEQ ID NO:37: 

TCTATCTAGAGCTCTGTAGAGC 

Additionally, polynucleotides containing other sequences, particularly unique sequences, 
present in naturally occurring nucleic acid molecules comprising SEQ ID NOS:8-37 may be used in 
the practice of the invention. 
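
As a non-limiting illustration of how a short sequence may be placed on the "sense" or complementary strand of a longer disclosed sequence, the following Python sketch locates SEQ ID NOS:35, 36 and 37 within two consecutive rows taken from near the 3' end of SEQ ID NO:7 as printed above. The helper names `reverse_complement` and `locate` are hypothetical and are introduced only for this example; no particular software is specified by the disclosure.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def locate(oligo: str, target: str):
    """Return ('sense', index) or ('antisense', index), or None if absent."""
    idx = target.find(oligo)
    if idx != -1:
        return ("sense", idx)
    idx = target.find(reverse_complement(oligo))
    if idx != -1:
        return ("antisense", idx)
    return None


# Two consecutive 60-nucleotide rows from near the 3' end of SEQ ID NO:7 (HOXB13).
HOXB13_3P = (
    "AGTGGCAATAATCACGATAACCAGTACTAGCTGCCATGATCGTTAGCCTCATATTTTCTA"
    "TCTAGAGCTCTGTAGAGCACTTTAGAAACCGCTTTCATGAATTGAGCTAATTATGAATAA"
)
SEQ_35 = "GCCATGATCGTTAGCCTCATATT"
SEQ_36 = "CAATTCATGAAAGCGGTTTCTAAAG"
SEQ_37 = "TCTATCTAGAGCTCTGTAGAGC"

if __name__ == "__main__":
    for name, oligo in [("SEQ ID NO:35", SEQ_35),
                        ("SEQ ID NO:36", SEQ_36),
                        ("SEQ ID NO:37", SEQ_37)]:
        print(name, locate(oligo, HOXB13_3P))
```

In this excerpt, SEQ ID NOS:35 and 37 are found on the strand as printed, while SEQ ID NO:36 is found only as its reverse complement, with SEQ ID NO:37 lying between the other two.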

Other polynucleotides for use in the practice of the invention include those that have 
sufficient homology to those described above to detect expression by use of hybridization 
techniques. Such polynucleotides preferably have about or 95%, about or 96%, about or 97%, 
about or 98%, or about or 99% identity with IL17BR, CACNA1D, or HOXB13 sequences as 
described herein. Identity is determined using the BLAST algorithm, as described above. The 
other polynucleotides for use in the practice of the invention may also be described on the basis of 
the ability to hybridize to polynucleotides of the invention under stringent conditions of about 30% 
v/v to about 50% formamide and from about 0.01M to about 0.15M salt for hybridization and from 
about 0.01M to about 0.15M salt for wash conditions at about 55 to about 65°C or higher, or 
conditions equivalent thereto. 
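
The stringent conditions recited above may be expressed, purely as an illustrative sketch, as a range check in Python. The class and function names are hypothetical. Treating each "about" as an exact inclusive bound, and applying the temperature to the wash step only, are simplifying assumptions; "conditions equivalent thereto" are not modeled.

```python
from dataclasses import dataclass


@dataclass
class HybridizationConditions:
    formamide_pct_vv: float   # percent formamide (v/v) during hybridization
    hyb_salt_molar: float     # salt concentration (M) during hybridization
    wash_salt_molar: float    # salt concentration (M) during the wash
    wash_temp_c: float        # wash temperature in degrees Celsius


def is_stringent(c: HybridizationConditions) -> bool:
    """True if the conditions fall within the recited stringent ranges."""
    return (
        30.0 <= c.formamide_pct_vv <= 50.0
        and 0.01 <= c.hyb_salt_molar <= 0.15
        and 0.01 <= c.wash_salt_molar <= 0.15
        and c.wash_temp_c >= 55.0   # "about 55 to about 65 C or higher"
    )


if __name__ == "__main__":
    print(is_stringent(HybridizationConditions(50.0, 0.10, 0.05, 60.0)))
```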

In a further embodiment of the invention, a population of single stranded nucleic acid 
molecules comprising one or both strands of a human IL17BR or CACNA1D sequence is provided 
as a probe such that at least a portion of said population may be hybridized to one or both strands of 
a nucleic acid molecule quantitatively amplified from RNA of a breast cancer cell. The population 
may be only the antisense strand of a human IL17BR or CACNA1D sequence such that a sense 
strand of a molecule from, or amplified from, a breast cancer cell may be hybridized to a portion of 
said population. The population preferably comprises a sufficient excess of said one or 
both strands of a human IL17BR or CACNA1D sequence in comparison to the amount of expressed 
(or amplified) nucleic acid molecules containing a complementary IL17BR or CACNA1D sequence 
from a normal breast cell. This condition of excess permits the increased amount of nucleic acid 
expression in a breast cancer cell to be readily detectable as an increase. 

Alternatively, the population of single stranded molecules is equal to or in excess of all of 
one or both strands of the nucleic acid molecules amplified from a breast cancer cell such that the 
population is sufficient to hybridize to all of one or both strands. Preferred cells are those of a 
breast cancer patient that is ER+ or for whom treatment with tamoxifen or one or more other 
"antiestrogen" agents against breast cancer is contemplated. The single stranded molecules may of 
course be the denatured form of any IL17BR and/or CACNA1D sequence containing double 
stranded nucleic acid molecule or polynucleotide as described herein. 

The population may also be described as being hybridized to IL17BR or CACNA1D 
sequence containing nucleic acid molecules at a level of at least twice as much as that by nucleic 
acid molecules of a normal breast cell. As in the embodiments described above, the nucleic acid 
molecules may be those quantitatively amplified from a breast cancer cell such that they reflect the 
amount of expression in said cell. 

The population is preferably immobilized on a solid support, optionally in the form of a 
location on a microarray. A portion of the population is preferably hybridized to nucleic acid 
molecules quantitatively amplified from a non-normal or abnormal breast cell by RNA 
amplification. The amplified RNA may be that derived from a breast cancer cell, as long as the 
amplification used was quantitative with respect to IL17BR or CACNA1D containing sequences. 

In another embodiment of the invention, expression based on detection of DNA status may 
be used. Detection of the HOXB13 DNA as methylated, deleted or otherwise inactivated, may be 
used as an indication of decreased expression as found in non-normal breast cells. This may be 
readily performed by PCR based methods known in the art. The status of the promoter regions of 
HOXB13 may also be assayed as an indication of decreased expression of HOXB13 sequences. A 
non-limiting example is the methylation status of sequences found in the promoter region. 

Conversely, detection of the DNA of a sequence as amplified may be used as an 
indication of increased expression as found in non-normal breast cells. This may be readily 
performed by PCR based, fluorescent in situ hybridization (FISH) and chromosome in situ 
hybridization (CISH) methods known in the art. 

A preferred embodiment using a nucleic acid based assay to determine expression is
immobilization of one or more of the sequences identified herein on a solid support, including, but
not limited to, a solid substrate as an array or to beads or bead based technology as known in the art.
Alternatively, solution based expression assays known in the art may also be used. The 
immobilized sequence(s) may be in the form of polynucleotides as described herein such that the 
polynucleotide would be capable of hybridizing to a DNA or RNA corresponding to the 
sequence(s).

The immobilized polynucleotide(s) may be used to determine the state of nucleic acid
samples prepared from sample breast cancer cell(s), optionally as part of a method to detect ER
status in said cell(s). Without limiting the invention, such a cell may be from a patient suspected of
being afflicted with, or at risk of developing, breast cancer. The immobilized polynucleotide(s)
need only be sufficient to specifically hybridize to the corresponding nucleic acid molecules derived
from the sample (and to the exclusion of detectable or significant hybridization to other nucleic acid
molecules).

In yet another embodiment of the invention, a ratio of the expression levels of two of the
disclosed genes may be used to predict response to treatment with TAM or another SERM.
Preferably, the ratio is that of two genes with opposing patterns of expression, such as an
underexpressed gene to an overexpressed gene, in correlation to the same phenotype. Non-limiting
examples include the ratio of HOXB13 over IL17BR or the ratio of HOXB13 over CACNA1D.
This aspect of the invention is based in part on the observation that such a ratio has a stronger
correlation with TAM treatment outcome than the expression level of either gene alone. For
example, the ratio of HOXB13 over IL17BR has an observed classification accuracy of 77%.
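
As an illustration only, a two-gene expression ratio can be derived from Q-PCR cycle threshold (Ct) values. The sketch below assumes roughly 100% amplification efficiency for both assays and equal input material, so that one Ct unit corresponds to a two-fold change; the function names and the Ct values in the usage note are hypothetical and are not taken from the examples herein.

```python
def log2_expression_ratio(ct_numerator: float, ct_denominator: float) -> float:
    """Return log2(numerator gene / denominator gene) from two Ct values.

    Assumption: both assays amplify with ~100% efficiency, so a lower Ct
    by one cycle means twice as much starting template.
    """
    return ct_denominator - ct_numerator


def expression_ratio(ct_numerator: float, ct_denominator: float) -> float:
    """Return the linear-scale expression ratio implied by two Ct values."""
    return 2.0 ** log2_expression_ratio(ct_numerator, ct_denominator)
```

For instance, hypothetical Ct values of 25 for HOXB13 and 28 for IL17BR would give a HOXB13 over IL17BR ratio of 2^3 under these assumptions. Working on the log2 scale keeps the ratio symmetric around zero, which may be convenient when a cutoff is chosen.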

As a non-limiting example, the Ct values from Q-PCR based detection of gene expression 
levels may be used to derive a ratio to predict the response to treatment with one or more
"antiestrogen" agents against breast cancer.

Additional Embodiments of the Invention 

In embodiments where only one or a few genes are to be analyzed, the nucleic acid derived
from the sample breast cancer cell(s) may be preferentially amplified by use of appropriate primers
such that only the genes to be analyzed are amplified to reduce contaminating background signals
from other genes expressed in the breast cell. Alternatively, and where multiple genes are to be
analyzed or where very few cells (or one cell) are used, the nucleic acid from the sample may be
globally amplified before hybridization to the immobilized polynucleotides. Of course, RNA, or the
cDNA counterpart thereof, may be directly labeled and used, without amplification, by methods
known in the art.

Sequence expression based on detection of a presence, increase, or decrease in protein levels
or activity may also be used. Detection may be performed by any immunohistochemistry (IHC)
based, bodily fluid based (where an IL17BR, CACNA1D, and/or HOXB13 polypeptide is found in a
bodily fluid, such as but not limited to blood), antibody (including autoantibodies against the
protein where present) based, exfoliated cell (from the cancer) based, mass spectroscopy based, and
image (including use of labeled ligand where available) based method known in the art and
recognized as appropriate for the detection of the protein. Antibody and image based methods are
additionally useful for the localization of tumors after determination of cancer by use of cells
obtained by a non-invasive procedure (such as ductal lavage or fine needle aspiration), where the
source of the cancerous cells is not known. A labeled antibody or ligand may be used to localize
the carcinoma(s) within a patient.

Antibodies for use in such methods of detection include polyclonal antibodies, optionally
isolated from naturally occurring sources where available, and monoclonal antibodies, including
those prepared by use of IL17BR, CACNA1D, and/or HOXB13 polypeptides as antigens. Such
antibodies, as well as fragments thereof (including but not limited to Fab fragments), function to
detect or diagnose non-normal or cancerous breast cells by virtue of their ability to specifically bind
IL17BR, CACNA1D, or HOXB13 polypeptides to the exclusion of other polypeptides to produce a
detectable signal. Recombinant, synthetic, and hybrid antibodies with the same ability may also be
used in the practice of the invention. Antibodies may be readily generated by immunization with an
IL17BR, CACNA1D, or HOXB13 polypeptide, and polyclonal sera may also be used in the practice
of the invention.

Antibody based detection methods are well known in the art and include sandwich and
ELISA assays as well as Western blot and flow cytometry based assays as non-limiting examples.
Samples for analysis in such methods include any that contain IL17BR, CACNA1D, or HOXB13
polypeptides. Non-limiting examples include those containing breast cells and cell contents as well
as bodily fluids (including blood, serum, saliva, lymphatic fluid, as well as mucosal and other
cellular secretions as non-limiting examples) that contain the polypeptides.

The above assay embodiments may be used in a number of different ways to identify or
detect the response to treatment with TAM or another "antiestrogen" agent against breast cancer
based on gene expression in a breast cancer cell sample from a patient. In some cases, this would
reflect a secondary screen for the patient, who may have already undergone mammography or
physical exam as a primary screen. If positive from the primary screen, the subsequent needle
biopsy, ductal lavage, fine needle aspiration, or other analogous minimally invasive method may
provide the sample for use in the assay embodiments before, simultaneous with, or after assaying
for ER status. The present invention is particularly useful in combination with non-invasive
protocols, such as ductal lavage or fine needle aspiration, to prepare a breast cell sample.

The present invention provides a more objective set of criteria, in the form of gene
expression profiles of a discrete set of genes, to discriminate (or delineate) between breast cancer
outcomes. In particularly preferred embodiments of the invention, the assays are used to
discriminate between good and poor outcomes after treatment with tamoxifen or another
"antiestrogen" agent against breast cancer. Comparisons that discriminate between outcomes after
about 10, about 20, about 30, about 40, about 50, about 60, about 70, about 80, about 90, about 100,
or about 150 months may be performed.

While good and poor survival outcomes may be defined relatively in comparison to each
other, a "good" outcome may be viewed as a better than 50% survival rate after about 60 months
post surgical intervention to remove breast cancer tumor(s). A "good" outcome may also be a better
than about 60%, about 70%, about 80% or about 90% survival rate after about 60 months post
surgical intervention. A "poor" outcome may be viewed as a 50% or less survival rate after about
60 months post surgical intervention to remove breast cancer tumor(s). A "poor" outcome may also
be about a 70% or less survival rate after about 40 months, or about an 80% or less survival rate after
about 20 months, post surgical intervention.

In another embodiment of the invention based on the expression of a few genes, the isolation 
and analysis of a breast cancer cell sample may be performed as follows: 

(1) Ductal lavage or other non-invasive procedure is performed on a patient to obtain a sample. 

(2) Sample is prepared and coated onto a microscope slide. Note that ductal lavage results in
clusters of cells that are cytologically examined as stated above.

(3) Pathologist or image analysis software scans the sample for the presence of atypical cells. 

(4) If atypical cells are observed, those cells are harvested (e.g. by microdissection such as 
LCM). 

(5) RNA is extracted from the harvested cells. 

(6) RNA is assayed, directly or after conversion to cDNA or amplification therefrom, for the 
expression of IL17BR, CACNA1D, and/or HOXB13 sequences. 

With use of the present invention, skilled physicians may prescribe or withhold treatment 
with TAM or another "antiestrogen" agent against breast cancer based on prognosis determined via
practice of the instant invention. 

The above discussion is also applicable where a palpable lesion is detected followed by fine 
needle aspiration or needle biopsy of cells from the breast. The cells are plated and reviewed by a 
pathologist or automated imaging system which selects cells for analysis as described above. 

The present invention may also be used, however, with solid tissue biopsies, including those
stored as an FFPE specimen. For example, a solid biopsy may be collected and prepared for
visualization followed by determination of expression of one or more genes identified herein to
determine the breast cancer outcome. As another non-limiting example, a solid biopsy may be
collected and prepared for visualization followed by determination of HOXB13, IL17BR and/or
CACNA1D expression. One preferred means is by use of in situ hybridization with polynucleotide
or protein identifying probe(s) for assaying expression of said gene(s).

In an alternative method, the solid tissue biopsy may be used to extract molecules followed
by analysis for expression of one or more gene(s). This provides the possibility of leaving out the
need for visualization and collection of only cancer cells or cells suspected of being cancerous.
This method may of course be modified such that only cells that have been positively selected are
collected and used to extract molecules for analysis. This would require visualization and selection
as a prerequisite to gene expression analysis. In the case of an FFPE sample, cells may be obtained
followed by RNA extraction, amplification and detection as described herein.

In a further alternative to all of the above, the sequence(s) identified herein may be used as
part of a simple PCR or array based assay to determine the response to treatment with TAM
or another "antiestrogen" agent against breast cancer by use of a sample from a non-invasive or
minimally invasive sampling procedure. The detection of sequence expression from samples may
be by use of a single microarray able to assay expression of the disclosed sequences as well as other
sequences, including sequences known not to vary in expression levels between normal and
non-normal breast cells, for convenience and improved accuracy.

Other uses of the present invention include providing the ability to identify breast cancer cell 
samples as having different responses to treatment with TAM or another "antiestrogen" agent 
against breast cancer for further research or study. This provides an advance based on objective
genetic/molecular criteria. 

The genes identified herein also may be used to generate a model capable of predicting the 
breast cancer survival and recurrence outcomes of an ER+ breast cell sample based on the 
expression of the identified genes in the sample. Such a model may be generated by any of the 
algorithms described herein or otherwise known in the art as well as those recognized as equivalent
in the art using gene(s) (and subsets thereof) disclosed herein for the identification of breast cancer 
outcomes. The model provides a means for comparing expression profiles of gene(s) of the subset 
from the sample against the profiles of reference data used to build the model. The model can 
compare the sample profile against each of the reference profiles or against a model defining 
delineations made based upon the reference profiles. Additionally, relative values from the sample
profile may be used in comparison with the model or reference profiles. 
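
As a non-limiting illustration, comparing a sample profile against each of the reference profiles could be sketched as a nearest-reference classification. The use of Pearson correlation, the class labels, and the function names below are assumptions made for this sketch; they are not the model or algorithm disclosed herein.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def classify(sample, references):
    """Return the label whose reference profiles correlate best, on average, with the sample.

    references: dict mapping an outcome label to a list of reference profiles,
    each profile listing the same genes in the same order as the sample.
    """
    def mean_correlation(profiles):
        return sum(pearson(sample, p) for p in profiles) / len(profiles)

    return max(references, key=lambda label: mean_correlation(references[label]))
```

A sample would be passed as a list of expression values for the chosen gene subset, and the returned label is simply the outcome class whose reference profiles it most resembles under this hypothetical similarity measure.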

In a preferred embodiment of the invention, breast cell samples identified as normal and
cancerous from the same subject may be analyzed, optionally by use of a single microarray, for
their expression profiles of the genes used to generate the model. This provides an advantageous
means of identifying survival and recurrence outcomes based on relative differences from the
expression profile of the normal sample. These differences can then be used in comparison to
differences between normal and individual cancerous reference data that were also used to
generate the model.

Articles of Manufacture

The materials and methods of the present invention are ideally suited for preparation of kits 
produced in accordance with well known procedures. The invention thus provides kits comprising 
agents (like the polynucleotides and/or antibodies described herein as non-limiting examples) for 
the detection of expression of the disclosed sequences. Such kits, optionally comprising the agent
with an identifying description or label or instructions relating to their use in the methods of the
present invention, are provided. Such a kit may comprise containers, each with one or more of the 
various reagents (typically in concentrated form) utilized in the methods, including, for example, 
pre-fabricated microarrays, buffers, the appropriate nucleotide triphosphates (e.g., dATP, dCTP, 
dGTP and dTTP; or rATP, rCTP, rGTP and UTP), reverse transcriptase, DNA polymerase, RNA
polymerase, and one or more primer complexes of the present invention (e.g., appropriate length 
poly(T) or random primers linked to a promoter reactive with the RNA polymerase). A set of 
instructions will also typically be included. 

The methods provided by the present invention may also be automated in whole or in part. 

All aspects of the present invention may also be practiced such that they consist essentially of a
subset of the disclosed genes to the exclusion of material irrelevant to the identification of breast 
cancer survival outcomes via a cell containing sample. 

Having now generally described the invention, the same will be more readily understood 
through reference to the following examples which are provided by way of illustration, and are not
intended to be limiting of the present invention, unless specified.

Examples 

Example 1 
General methods 

Patient and tumor selection criteria and study design
Patient inclusion criteria for this study were: women diagnosed at the Massachusetts
General Hospital (MGH) between 1987 and 2000 with ER positive breast cancer, treatment with
standard breast surgery (modified radical mastectomy or lumpectomy) and radiation followed by
five years of systemic adjuvant tamoxifen; no patient received chemotherapy prior to recurrence.
Clinical and follow-up data were derived from the MGH tumor registry. There were no missing
registry data and all available medical records were reviewed as a second tier of data confirmation.

All tumor specimens collected at the time of initial diagnosis were obtained from frozen and
formalin fixed paraffin-embedded (FFPE) tissue repositories at the Massachusetts General Hospital.
Tumor samples with greater than 20% tumor cells were selected with a median of greater than 75%
for all samples. Each sample was evaluated for the following features: tumor type (ductal vs.
lobular), tumor size, and Nottingham combined histological grade. Estrogen and progesterone
receptor expression were determined by biochemical hormone binding analysis and/or by
immunohistochemical staining as described (Long, A.A. et al. "High-specificity in-situ
hybridization. Methods and application." Diagn Mol Pathol 1, 45-57 (1992)); receptor positivity
was defined as greater than 3 fmol/mg tumor tissue (Long et al.) and greater than 1% nuclear
staining for the biochemical and immunohistochemical assays, respectively.

Study design is as follows: A training set of 60 frozen breast cancer specimens was selected
to identify gene expression signatures predictive of outcome or response, in the setting of adjuvant
tamoxifen therapy. Tumors from responders were matched to the non-responders with respect to
TNM staging and tumor grade. Differential gene expression identified in the training set was
validated in an independent group of 20 invasive breast tumors with formalin fixed
paraffin-embedded (FFPE) tissue samples.

LCM, RNA isolation and amplification

With each frozen tumor sample within the 60-case cohort, RNA was isolated from both a
whole tissue section of 8 µm in thickness and a highly enriched population of 4,000-5,000 malignant
epithelial cells acquired by laser capture microdissection using a PixCell IIe LCM system (Arcturus,
Mountain View, CA). From each tumor sample within the 20-case test set, RNA was isolated from
four 8 µm-thick FFPE tissue sections. Isolated RNA was subjected to one round of T7 polymerase
in vitro transcription using the RiboAmp™ kit (frozen samples) or another system for FFPE
samples according to manufacturer's instructions (Arcturus Bioscience, Inc., Mountain View, CA
for RiboAmp™). Labeled cRNA was generated by a second round of T7-based RNA in vitro
transcription in the presence of 5-[3-Aminoallyl]uridine 5'-triphosphate (Sigma-Aldrich, St. Louis,
MO). Universal Human Reference RNA (Stratagene, San Diego, CA) was amplified in the same 
manner. The purified aRNA was later conjugated to Cy5 (experimental samples) or Cy3 (reference 
sample) dye (Amersham Biosciences).

Microarray analysis

A custom designed 22,000-gene oligonucleotide (60mer) microarray was fabricated using
ink-jet in-situ synthesis technology (Agilent Technologies, Palo Alto, CA). Cy5-labeled sample
RNA and Cy3-labeled reference RNA were co-hybridized at 65°C in 1X hybridization buffer (Agilent
Technologies). Slides were washed at 37°C with 0.1X SSC/0.005% Triton X-102. Image analysis
was performed using Agilent's image analysis software. Raw Cy5/Cy3 ratios were normalized
using intensity-dependent non-linear regression.

A data matrix consisting of normalized Cy5/Cy3 ratios from all samples was median
centered for each gene. The variance of each gene over all samples was calculated and the top 25%
high variance genes (5,475) were selected for further analysis. Identification and permutation testing
for significance of differential gene expression were performed using BRB ArrayTools, developed by
Dr. Richard Simon and Amy Peng (see http site at linus.nci.nih.gov/BRB-ArrayTools.html).
Hierarchical cluster analysis was performed with GeneMaths software (Applied-Maths, Belgium)
using cosine correlation and complete linkage. All other statistical procedures (two-sample t-test,
receiver operating characteristic analysis, multivariate logistic regression and survival analysis)
were performed in the open source R statistical environment (see http site at www.r-project.org).
Statistical test of significance of ROC curves was by the method of DeLong ("Comparing the areas
under two or more correlated receiver operating characteristic curves: a nonparametric approach."
Biometrics 44, 837-45 (1988)). Disease free survival was calculated from the date of diagnosis.
Events were scored as the first distant metastasis, and patients remaining disease-free at the last
follow-up were censored. Survival curves were calculated by the Kaplan-Meier estimates and
compared by log-rank tests.
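
The median centering and variance filtering steps described above can be sketched as follows. This is a minimal illustration and not the BRB ArrayTools implementation; it assumes the ratios have already been normalized (and, typically, log transformed), and the function names and the dictionary layout of the data matrix are hypothetical.

```python
from statistics import median, variance


def median_center(matrix):
    """Subtract each gene's median across samples from that gene's values.

    matrix: dict mapping a gene identifier to a list of per-sample values.
    """
    centered = {}
    for gene, values in matrix.items():
        m = median(values)
        centered[gene] = [v - m for v in values]
    return centered


def top_variance_genes(matrix, fraction=0.25):
    """Return the genes in the top `fraction` when ranked by variance over all samples."""
    ranked = sorted(matrix, key=lambda gene: variance(matrix[gene]), reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]
```

With a fraction of 0.25, a matrix of roughly 22,000 genes would be reduced to about a quarter of its rows, which is in line with the 5,475 genes retained in the analysis described above.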

Real-Time Quantitative PCR analysis
Real-time PCR was performed on 59 of the 60-case training samples (one case was excluded
due to insufficient materials) and the 20-case validation samples. Briefly, 2 µg of amplified RNA
was converted into double stranded cDNA. For each case, 12 ng of cDNA in triplicate was used for
real-time PCR with an ABI 7900HT (Applied Biosystems) as described (Gelmini, S. et al.
"Quantitative polymerase chain reaction-based homogeneous assay with fluorogenic probes to
measure c-erbB-2 oncogene amplification." Clin Chem 43, 752-8 (1997)). The sequences of the
PCR primer pairs and fluorogenic MGB probe (5' to 3'), respectively, that were used for each gene
are as follows:
HOXB13
TTCATCCTGACAGTGGCAATAATC,
CTAGATAGAAAATATGAGGCTAACGATCAT,
VIC-CGATAACCAGTACTAGCTG;

IL17BR
GCATTAACTAACGATTGGAAACTACATT,
1 GGAAGATGCTTTATTGTTGCATTATC,
VIC-ACAACTTCAAAGCTGTTTTA.

Relative expression levels of HOXB13 in normal, DCIS and IDC samples were calculated
as follows. First, all CT values were adjusted by subtracting the highest CT (40) among all samples;
then relative expression = 1 / 2^CT, where CT is the adjusted value.
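
The calculation above can be written out as a short sketch. The function and constant names are hypothetical; the only inputs taken from the text are the reference CT of 40 and the formula itself.

```python
MAX_CT = 40.0  # highest CT among all samples, as stated in the text


def relative_expression(ct: float, max_ct: float = MAX_CT) -> float:
    """Relative expression = 1 / 2**(CT - max_ct).

    The adjusted CT is zero or negative, so the result is 1.0 for a sample
    at the maximum CT and doubles for every cycle below it.
    """
    adjusted_ct = ct - max_ct
    return 1.0 / (2.0 ** adjusted_ct)
```

Under this convention a sample with a CT of 38 has a relative expression of 4, that is, four times the level of a sample that only crosses the threshold at cycle 40.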

In Situ Hybridization

Dig-labeled RNA probes were prepared using the DIG RNA labeling kit (SP6/T7) from Roche
Applied Science, following the protocol provided with the kit. In situ hybridization was performed
on frozen tissue sections as described (Long et al.).

Table 1. Patients and tumor characteristics of training set.

Sample ID  Tumor type  Size  Grade  Nodes  ER   PR   Age  DFS  Status
1389       D           1.7   2      0/1    Pos  Pos  80   94   0
648        D           1.1   2      0/15   Pos  ND   62   160  0
289        D           3     2      0/15   Pos  ND   75   63   1
749        D           1.8   2      2/9    Pos  Pos  61   137  0
420        D/L         2     3      ND     Pos  Pos  72   58
633        D           2.7   3      0/11   Pos  ND   61   20
662        D           1     3      6/11   Pos  Pos  79   27
849        D           2     1      0/26   Pos  Neg  75   23
356        D           1     2      2/20   Pos  ND   58   24
1304       D           2     3      0/14   Pos  Pos  57   20
1419       D           2.5   2      1/8    Pos  Pos  59   86   0
1093       D           1     3      1/14   Pos  Pos  66   85   0
1047       D/L         2.6   2      0/18   Pos  Neg  70   128  0
1037       D/L         1.5   2      0/4    Pos  Pos  85   83   0
319        D           4     2      1/13   Pos  ND   67   44   1
25         D           3.5   2      0/9    Neg  Pos  62   75   1
180        D           1.6   2      2/19   Pos  Pos  69   169  0
687        D           3.5   3      3/16   Pos  ND   73   142  0
856        D           1.6   2      0/16   Pos  Pos  73   88   0
1045       D           2.5   3      1/12   Pos  Neg  73   121  0
1205       D           2.7   2      1/19   Pos  Pos  71   88   0
1437       D           1.7   2      2/22   Pos  Pos  67   89   0
1507       D           3.7   3      0/40   Pos  Pos  70   70   0
469        D           1     1      0/19   Pos  ND   66   161  0
829        D           1.2   2      0/9    Pos  ND   69   136  0
868        D           3     3      0/13   Pos  Pos  65   130  0
1206       D           4.1   3      0/15   Pos  Neg  84   56   1
843        D           3.4   2      11/20  Pos  Neg  76   122  1
342        D           3     2      9/21   Pos  ND   62   102  1
1218       D           4.5   1      3/16   Pos  Pos  62   10   1
547        D/L         1.5   2      ND     Pos  ND   74   129  1
1125       D           2.6   2      0/18   Pos  Pos  54   123  0
1368       D           2.6   2      ND     Pos  Pos  82   63   0
605        D           2.2   2      6/18   Pos  ND   70   110  0
59         L           3     2      33/38  Pos  ND   70   21   1
68         D           3     2      0/17   Pos  ND   53   38   1
317        D           1.2   3      1/10   Pos  Pos  71   5    1
374        D           1     3      0/15   Pos  Neg  57   47   1
823        D           2     2      0/6    Pos  Pos  51   69   1
280        D           2.2   3      0/12   Pos  ND   66   44   1
651        D           4.7   3      10/13  Pos  ND   48   137  1
763        D           1.8   2      0/14   Pos  Pos  63   118
1085       D           4.7   2      0/8    Pos  Pos  48   101  1
1363       D           2.1   2      0/15   Pos  Pos  56   114
295        D           3.5   2      3/21   Pos  Pos  52   118  1
871        D           4     3      0/16   Pos  Neg  61   6    1
1343       D           2.5   3      ND     Pos  Pos  79   21   1
140        L           >2.0  2      18/28  Pos  ND   63   43   1
260        D/L         0.9   2      1/13   Pos  ND   73   42
297        D           0.8   2      1/16   Pos  Pos  66   169  0
1260       D           3.5   2      0/14   Pos  Pos  58   79   0
1405       D           1     3      ND     Pos  Pos  81   95   0
518        L           5.5   2      3/20   Pos  ND   68   156  0
607        D           1.2   2      5/14   Pos  Pos  76   114  0
638        D           2     2      1/24   Pos  Pos  67   148  0
655        D           2     3      ND     Pos  Pos  73   143  0
772        D           2.5   2      0/18   Pos  Pos  68   69   1
878        D/L         1.6   2      0/9    Pos  Neg  76   138  0
1279       D           2     2      0/12   Pos  Pos  68   102  0
1370       D           2     2      ND     Pos  Pos  73   61   0

Abbreviations: D, ductal; L, lobular; D/L, ductal and lobular features; Pos, positive; Neg, negative;
ND, not determined; ER, estrogen receptor; PR, progesterone receptor; DFS, disease-free survival
(number of months); status=1, recurred; status=0, disease-free.

Example 2

Identification of differentially expressed genes

Gene expression profiling was performed using a 22,000-gene oligonucleotide microarray as
described above. In the initial analysis, isolated RNA from frozen tumor-tissue sections taken from
the archived primary biopsies was used. The resulting expression dataset was first filtered based
on overall variance of each gene with the top 5,475 high-variance genes (75th percentile) selected
for further analysis. Using this reduced dataset, a t-test was performed on each gene comparing the
tamoxifen responders and non-responders, leading to identification of 19 differentially expressed
genes at the P value cutoff of 0.001 (Table 2). The probability of selecting this many or more
differentially expressed genes by chance was estimated to be 0.04 by randomly permuting the
patient class with respect to treatment outcome and repeating the t-test procedure 2,000 times. This
analysis thus demonstrated the existence of statistically significant differences in gene expression
between the primary breast cancers of tamoxifen responders and non-responders.
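
The permutation procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the BRB ArrayTools implementation: a cutoff on the absolute pooled-variance t statistic stands in for the P value cutoff of 0.001 (converting t to P would require the t distribution), and all function names and the data layout are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, variance


def t_statistic(a, b):
    """Two-sample t statistic with pooled variance (each group needs >= 2 values)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    diff = mean(a) - mean(b)
    if pooled == 0:
        return 0.0 if diff == 0 else float("inf")
    return diff / sqrt(pooled * (1 / na + 1 / nb))


def count_significant(data, labels, t_cutoff):
    """Count genes whose |t| between label-1 and label-0 samples reaches the cutoff."""
    count = 0
    for values in data.values():
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        if abs(t_statistic(group1, group0)) >= t_cutoff:
            count += 1
    return count


def permutation_probability(data, labels, t_cutoff, n_perm=2000, seed=0):
    """Estimate the chance of finding at least the observed number of genes
    when the outcome labels are randomly permuted."""
    rng = random.Random(seed)
    observed = count_significant(data, labels, t_cutoff)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if count_significant(data, shuffled, t_cutoff) >= observed:
            hits += 1
    return observed, hits / n_perm
```

Here `data` maps each gene to its per-sample values and `labels` marks responders and non-responders with 1 and 0. The returned fraction plays the role of the 0.04 estimate reported above, although the numbers produced by this sketch depend on the chosen cutoff and random seed.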



Table 2. 19-gene signature identified by t-test in the Sections dataset

No.  Parametric  Mean in     Mean in         Fold difference  GB acc      Description
     p-value     responders  non-responders  of means
1    1.96E-05    0.759       1.317           0.576            AW006861    SCYA4 | small inducible cytokine A4
2    2.43E-05    1.31        0.704           1.861            AI240933    ESTs
3    8.08E-05    0.768       1.424           0.539            X59770      IL1R2 | interleukin 1 receptor, type II
4    9.57E-05    0.883       1.425           0.62             AB000520    APS | adaptor protein with pleckstrin homology and src homology 2 domains
5    9.91E-05    1.704       0.659           2.586            AF208111    IL17BR | interleukin 17B receptor
6    0.0001833   0.831       1.33            0.625            AI820604    ESTs
7    0.0001935   0.853       1.459           0.585            AI087057    DOK2 | docking protein 2, 56kD
8    0.0001959   1.29        0.641           2.012            AJ272267    CHDH | choline dehydrogenase
9    0.0002218   1.801       0.943           1.91             N30081      ESTs, Weakly similar to I38022 hypothetical protein [H. sapiens]
10   0.0004234   1.055       2.443           0.432            AI700363    ESTs
11   0.0004357   0.451       1.57            0.287            ALT 1 /4Ub  ABCC11 | ATP-binding cassette, sub-family C (CFTR/MRP), member 11
12   0.0004372   1.12        3.702           0.303            BC007092    HOXB13 | homeo box B13
13   0.0005436   0.754       1.613           0.467            M92432      GUCY2D | guanylate cyclase 2D, membrane (retina-specific)
14   0.0005859   1.315       0.578           2.275            AL050227    Homo sapiens mRNA; cDNA DKFZp586M0723 (from clone DKFZp586M0723)
15   0.000635    1.382       0.576           2.399            AW613732    Homo sapiens cDNA FLJ31137 fis, clone IMR322001049
16   0.0008714   0.794       1.252           0.634            BC007783    SCYA3 | small inducible cytokine A3
17   0.0008912   2.572       1.033           2.49             X81896      C11orf25 | chromosome 11 open reading frame 25
18   0.0009108   0.939       1.913           0.491            BC004960    MGC10955 | hypothetical protein MGC10955
19   0.0009924   1.145       0.719           1.592            AK027250    Homo sapiens cDNA: FLJ23597 fis, clone LNG15281


To refine our analysis to the tumor cells and circumvent potential variability attributable to 
stromal cell contamination, the same cohort was reanalyzed following laser-capture microdissection 
(LCM) of tumor cells within each tissue section. Using variance based gene filtering and t-test 
screening identical to that utilized for the whole tissue section dataset, 9 differentially expressed
gene sequences were identified with P < 0.001 (Table 3). 



Table 3. 9-gene signature identified by t-test in the LCM dataset

No.  Parametric  Mean in     Mean in         Fold difference  GB acc      Description
     p-value     responders  non-responders  of means
1    2.67E-05    1.101       4.891           0.225            BC007092    HOXB13 | homeo box B13
2    0.0003393   1.045       2.607           0.401            AI700363    ESTs
3    0.0003736   0.64        1.414           0.453            NM_014298   QPRT | quinolinate phosphoribosyltransferase (nicotinate-nucleotide pyrophosphorylase (carboxylating))
4    0.0003777   1.642       0.694           2.366            AF208111    IL17BR | interleukin 17B receptor
5    0.0003895   0.631       1.651           0.382            AF033199    ZNF204 | zinc finger protein 204
6    0.0004524   1.97        0.576           3.42             AI688494    FLJ13189 | hypothetical protein FLJ13189
7    0.0005329   1.178       0.694           1.697            AI240933    ESTs
8    0.0007403   0.99        1.671           0.592            AL157459    Homo sapiens mRNA; cDNA DKFZp434B0425 (from clone DKFZp434B0425)
9    0.0007739   0.723       1.228           0.589            BC002480    FLJ13352 | hypothetical protein FLJ13352



Only 3 genes were identified as differentially expressed in both the LCM and whole tissue
section analyses: the homeobox gene HOXB13 (identified twice as AI700363 and BC007092), the
interleukin 17B receptor IL17BR (AF208111), and the voltage-gated calcium channel CACNA1D
(AI240933). HOXB13 was differentially overexpressed in tamoxifen nonresponsive cases, whereas
IL17BR and CACNA1D were overexpressed in tamoxifen responsive cases. Based on their
identification as tumor-derived markers significantly associated with clinical outcome in two
independent analyses, the utility of each of these genes was evaluated by itself and in combination
with the others.

To define the sensitivity and specificity of HOXB13, IL17BR and CACNA1D expression as
markers of clinical outcome, Receiver Operating Characteristic (ROC) analysis (Pepe, M.S. "An
interpretation for the ROC curve and inference using GLM procedures." Biometrics 56, 352-9
(2000)) was used. For data derived from whole tissue sections, the Area Under the Curve (AUC)
values for IL17BR, HOXB13 and CACNA1D were 0.79, 0.67 and 0.81, respectively (see Table 4
and Fig. 1, upper portion). ROC analysis of the data
generated from the microdissected tumor cells yielded AUC values of 0.76, 0.8, and 0.76 for these
genes (see Table 4 and Fig. 1, lower portion).
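
For orientation, an AUC of the kind reported here can be computed directly from expression values, because the area under the ROC curve equals the probability that a randomly chosen case from one outcome group scores higher than a randomly chosen case from the other (ties counting one half). The sketch below uses that pairwise formulation; it is not the R code used in the study, and the function name is hypothetical.

```python
def roc_auc(scores_positive, scores_negative):
    """Area under the ROC curve via pairwise comparison of the two groups.

    scores_positive: marker values for the group expected to score higher.
    scores_negative: marker values for the other group.
    """
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))
```

An AUC of 0.5 corresponds to a marker that separates the groups no better than chance, which is the null value against which the P values in Table 4 are assessed. For a marker that is lower in the group of interest, the two arguments are simply swapped.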

Table 4. ROC analysis of using IL17BR, CACNA1D and HOXB13 expression to predict
tamoxifen response

           Tissue Sections      LCM
Gene       AUC    P value       AUC    P value
IL17BR     0.79   1.58E-06      0.76   2.73E-05
CACNA1D    0.81   3.02E-08      0.76   1.59E-05
HOXB13     0.67   0.012         0.79   9.94E-07
ESR1       0.55   0.277         0.63   0.038
PGR        0.63   0.036         0.63   0.033
ERBB2      0.69   0.004         0.64   0.027
EGFR       0.56   0.200         0.61   0.068

AUC, area under the curve; P values are AUC > 0.5.

A statistical test of significance indicated that these AUC values are all significantly greater
than 0.5, the expected value from the null model that predicts clinical outcome randomly.
Therefore, these three genes have potential utility for predicting clinical outcome of adjuvant
tamoxifen therapy. For comparison, markers that are currently useful in evaluating the likelihood of
response to tamoxifen were also analyzed. The levels of ER (gene symbol ESR1) and
progesterone receptor (PR, gene symbol PGR) are known to be positively correlated with tamoxifen
response (see Fernandez, M.D., et al. "Quantitative oestrogen and progesterone receptor values in
primary breast cancer and predictability of response to endocrine therapy." Clin Oncol 9, 245-50
(1983); Ferno, M. et al. "Results of two or five years of adjuvant tamoxifen correlated to steroid
receptor and S-phase levels." South Sweden Breast Cancer Group, and South-East Sweden Breast
Cancer Group. Breast Cancer Res Treat 59, 69-76 (2000); Nardelli, G.B., et al. "Estrogen and
progesterone receptors status in the prediction of response of breast cancer to endocrine therapy
(preliminary report)." Eur J Gynaecol Oncol 7, 151-8 (1986); and Osborne, C.K., et al. "The value
of estrogen and progesterone receptors in the treatment of breast cancer." Cancer 46, 2884-8 (1980)).

In addition, growth factor signaling pathways (EGFR, ERBB2) are thought to negatively
regulate estrogen-dependent signaling, and hence contribute to loss of responsiveness to tamoxifen
(see Dowsett, M. "Overexpression of HER-2 as a resistance mechanism to hormonal therapy for
breast cancer." Endocr Relat Cancer 8, 191-5 (2001)). ROC analysis of these genes confirmed their
correlation with clinical outcome, but with AUC values ranging only from 0.55 to 0.69, reaching
statistical significance for PGR and ERBB2 (see Table 4).

The LCM dataset is particularly relevant, since EGFR, ERBB2, ESR1 and PGR are
currently measured at the tumor cell level using either immunohistochemistry or fluorescence in situ
hybridization. As individual markers of clinical outcome, HOXB13, IL17BR and CACNA1D all
outperformed ESR1, PGR, EGFR and ERBB2 (see Table 4).
Example 3

Identification of the HOXB13:IL17BR Expression Ratio 

The HOXB13:IL17BR expression ratio was identified as a robust composite predictor of
outcome as follows. Since HOXB13 and IL17BR have opposing patterns of expression, the
expression ratio of HOXB13 over IL17BR was examined to determine whether it provides a better
composite predictor of tamoxifen response. Indeed, both t-test and ROC analyses demonstrated that
the two-gene ratio had a stronger correlation with treatment outcome than either gene alone, both in
the whole tissue sections and LCM datasets (see Table 5). AUC values for HOXB13:IL17BR
reached 0.81 for the tissue sections dataset and 0.84 for the LCM dataset. Pairing HOXB13 with
CACNA1D or analysis of all three markers together did not provide additional predictive power.
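
A minimal sketch of how a two-gene ratio can be formed and compared between outcome groups is given below. It assumes the ratio is taken on a log2 scale and uses Welch's unequal-variance t statistic; the text does not state which t-test variant or scale was used, and all expression values and function names here are hypothetical.

```python
import math
from statistics import mean, variance


def log2_ratio(hoxb13: float, il17br: float) -> float:
    """Log2 of the HOXB13:IL17BR expression ratio."""
    return math.log2(hoxb13) - math.log2(il17br)


def welch_t(group_a, group_b) -> float:
    """Welch's t statistic for the difference in means (group_a minus group_b)."""
    se = math.sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


# Hypothetical (HOXB13, IL17BR) expression pairs, for illustration only.
responders = [(1.0, 4.0), (0.5, 2.0), (1.0, 8.0), (2.0, 4.0)]
nonresponders = [(4.0, 1.0), (8.0, 2.0), (2.0, 1.0), (4.0, 0.5)]

resp_ratio = [log2_ratio(h, i) for h, i in responders]
nonresp_ratio = [log2_ratio(h, i) for h, i in nonresponders]
t_stat = welch_t(resp_ratio, nonresp_ratio)
print(f"t = {t_stat:.2f}")
```

With responders as the first group, a marker that is higher in non-responders gives a negative statistic, which matches the sign convention of the HOXB13 and HOXB13:IL17BR entries in Table 5.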



Table 5. HOXB13:IL17BR ratio is a stronger predictor of treatment outcome

                                  t-test                    ROC
                                  t-statistic  P value      AUC    P value
Tissue Section   IL17BR            4.15        1.15E-04     0.79   1.58E-06
                 HOXB13           -3.57        1.03E-03     0.67   0.01
                 HOXB13:IL17BR    -4.91        1.48E-05     0.81   1.08E-07
LCM              IL17BR            3.70        5.44E-04     0.76   2.73E-05
                 HOXB13           -4.39        8.00E-05     0.79   9.94E-07
                 HOXB13:IL17BR    -5.42        2.47E-06     0.84   4.40E-11

AUC, area under the curve; P values are for AUC > 0.5.



The HOXB13:IL17BR ratio was compared to well-established prognostic factors for breast
cancer, such as patient age, tumor size, grade and lymph node status (see Fitzgibbons, P.L. et al.
"Prognostic factors in breast cancer. College of American Pathologists Consensus Statement 1999."
Arch Pathol Lab Med 124, 966-78 (2000)). Univariate logistic regression analysis indicated that
only tumor size was marginally significant in this cohort (P=0.04); this was not surprising given that
the responder group was closely matched to the non-responder group with respect to tumor size,
tumor grade and lymph node status during patient selection. Among the known positive (ESR1 and
PGR) and negative (ERBB2 and EGFR) predictors of tamoxifen response, ROC analysis of the
tissue sections data indicated that only PGR and ERBB2 were significant (see Table 4). Therefore,
logistic regression models containing the HOXB13:IL17BR ratio either by itself or
in combination with tumor size and expression levels of PGR and ERBB2 were compared (see Table
6). The HOXB13:IL17BR ratio alone was a highly significant predictor (P=0.0003) and had an
odds ratio of 10.2 (95% CI 2.9-35.6). In the multivariate model, the HOXB13:IL17BR ratio was the only
significant variable (P=0.002), with an odds ratio of 7.3 (95% CI 2.1-26.3). Thus, the expression ratio
of HOXB13:IL17BR is a strong independent predictor of treatment outcome in the setting of
adjuvant tamoxifen therapy.

Table 6. Logistic Regression Analysis 
Univariate Model 

Predictor Odds Ratio 95% CI P Value 

HOXB13:IL17BR 10.17 2.9-35.6 0.0003 

Multivariate Model 

Predictors Odds Ratio 95% CI P Value 

Tumor size 1.5 0.7-3.5 0.3289 

PGR 0.8 0.3-1.8 0.5600 

ERBB2 1.7 0.8-3.8 0.1620 

HOXB13:IL17BR 7.3 2.1-26.3 0.0022 

All predictors are continuous variables. Gene expression values were from microarray 
measurements. Odds ratio is the inter-quartile odds ratio, based on the difference of a predictor from 
its lower quartile (0.25) to its upper quartile (0.75); CI, confidence interval. 
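
The inter-quartile odds ratio described in the note to Table 6 can be sketched as follows, assuming a fitted logistic regression coefficient `beta` per unit of a continuous predictor. The coefficient, the predictor values and the choice of quartile interpolation method are hypothetical illustrations, not values from the study.

```python
import math
from statistics import quantiles


def interquartile_odds_ratio(beta: float, values) -> float:
    """Odds ratio for moving a continuous predictor from its lower quartile
    (0.25) to its upper quartile (0.75), given the logistic coefficient beta."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return math.exp(beta * (q3 - q1))


# Hypothetical predictor values and coefficient, for illustration only.
predictor_values = [0.0, 1.0, 2.0, 3.0, 4.0]
beta = 0.5
iq_or = interquartile_odds_ratio(beta, predictor_values)
print(f"inter-quartile odds ratio = {iq_or:.2f}")
```

Scaling by the inter-quartile range puts predictors measured in different units on a comparable footing, which is presumably why it is used for the mixed clinical and expression variables in Table 6.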

Example 4 

Independent validation of HOXB13:IL17BR expression ratio

The reduction of a complex microarray signature to a two-gene expression ratio allows the
use of simpler detection strategies, such as real-time quantitative PCR (RT-QPCR) analysis. The
HOXB13:IL17BR expression ratio was analyzed by RT-QPCR using frozen tissue sections that were
available from 59 of the 60 training cases (Fig. 2, part a). RT-QPCR data were highly
concordant with the microarray data of frozen tumor specimens (correlation coefficient r=0.83 for
HOXB13, 0.93 for IL17BR). In addition, the PCR-derived HOXB13:IL17BR ratios, represented as
ΔCTs, where CT is the number of PCR amplification cycles needed to reach a predetermined
threshold amount (e.g., Fig. 2, parts a and b) and ΔCT is the CT difference between HOXB13 and
IL17BR, were highly correlated with the microarray-derived data (r=0.83) and with treatment
outcome (t test P=0.0001, Fig. 2, part c). Thus, conventional RT-QPCR analysis for the expression
ratio of HOXB13 to IL17BR appears to be equivalent to microarray-based analysis of frozen tumor
specimens.
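
The relationship between a ΔCT value and an expression ratio can be sketched as below. The sketch assumes that both amplicons double with each cycle (amplification factor 2.0), which is a common idealization and not something stated in the text; the CT values and function names are hypothetical.

```python
def delta_ct(ct_hoxb13: float, ct_il17br: float) -> float:
    """CT difference between HOXB13 and IL17BR."""
    return ct_hoxb13 - ct_il17br


def expression_ratio(dct: float, amplification_factor: float = 2.0) -> float:
    """Approximate HOXB13:IL17BR ratio implied by a delta-CT value.
    A lower delta-CT means HOXB13 reached threshold earlier, so the ratio is higher."""
    return amplification_factor ** (-dct)


# Hypothetical CT values, for illustration only.
dct = delta_ct(24.0, 27.0)
ratio = expression_ratio(dct)
print(f"delta-CT = {dct:.1f}, HOXB13:IL17BR ratio = {ratio:.1f}")
```

Because CT is inversely related to the starting template amount, lower ΔCT values correspond to relatively higher HOXB13 expression.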

To validate the predictive utility of HOXB13:IL17BR expression ratio in an independent 
patient cohort, 20 additional ER-positive early-stage primary breast tumors from women treated 
with adjuvant tamoxifen only at MGH between 1991 and 2000, and for which medical records and 
paraffin-embedded tissues were available, were identified. Of the 20 archival cases, 10 had 
recurred with a median time to recurrence of 5 years, and 10 had remained disease-free with a
median follow up of 9 years (see Table 7 for details). 



Table 7. Patient and tumor characteristics of the validation set.

Sample    Tumor Type  Size  Grade  Nodes  ER   PR   Age  DFS  Status
Test 1    D           1.9   3      0/10   Pos  Pos  69   15
Test 2    D           1.7   3      0/19   Pos  Pos  61   117
Test 3    D           1.7   2      0/26   Pos  ND   65   18
Test 4    D           1.2   2      0/19   Pos  Pos  63   69
Test 5    D           1.7   2      2/2    Pos  Pos  60   52
Test 6    D           1.1   1      0/10   Pos  Pos  54   59
Test 7    D           >1.6  2      0/17   Pos  Neg  66   32
Test 8    L           2.6   1-2    0/14   Pos  Pos  58   67
Test 9    D           1.2   2      ND     Pos  Pos  93   58
Test 10   D           4     3      0/20   Pos  Pos  66   27
Test 11   D           1.1   2      0/19   Pos  Pos  64   97   0
Test 12   D           2.7   2      0/10   Pos  Pos  66   120  0
Test 13   D           0.9   1      0/22   Pos  Pos  66   123  0
Test 14   D           2.1   2      0/16   Pos  Pos  57   83   0
Test 15   D           0.8   1-2    0/8    Pos  Pos  74   80   0
Test 16   D           1     2      0/13   Pos  Pos  74   93   0
Test 17   D           1.6   2      0/29   Pos  Pos  66   121  0
Test 18   L           1.5   1-2    0/8    Pos  Pos  65   25   0
Test 19   D           1.5   3      0/16   Pos  Pos  60   108  0
Test 20*  L           4     1-2    0/19   Pos  Pos  60   108  0

Abbreviations: Same as supplemental Table.
* Patient received tamoxifen for 2 years.

RNA was extracted from formalin-fixed paraffin-embedded (FFPE) whole tissue sections,
linearly amplified, and used as template for RT-QPCR analysis. Consistent with the results of the
training cohort, the HOXB13:IL17BR expression ratio in this independent patient cohort was highly
correlated with clinical outcome (t test P=0.035), with higher HOXB13 expression (lower ΔCTs)
correlating with poor outcome (Fig. 2, part d). To test the predictive accuracy of the
HOXB13:IL17BR ratio, the RT-QPCR data from the frozen tissue sections (n=59) were used to
build a logistic regression model. In this training set, the model predicted treatment outcome with
an overall accuracy of 76% (P=0.000065, 95% confidence interval 63%-86%). The positive and
negative predictive values were 78% and 75%, respectively. Applying this model to the 20
independent patients in the validation cohort, treatment outcome for 15 of the 20 patients was
correctly predicted (overall accuracy 75%, P=0.04, 95% confidence interval 51%-91%), with
positive and negative predictive values of 78% and 73%, respectively.
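
The accuracy and predictive values reported for the validation cohort can be illustrated with a small worked example. The 2x2 counts below are an assumption: they are not given in the text and were chosen only because they are consistent with 15 of 20 correct predictions, 10 patients per outcome group, and predictive values of 78% and 73%. The exact (Clopper-Pearson) binomial interval is likewise an assumed method, since the text does not say how its confidence interval was computed; for 15 of 20 it falls close to the reported 51%-91%.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def _bisect(func, target: float, increasing: bool) -> float:
    """Solve func(p) = target for p in (0, 1), assuming func is monotone."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(k: int, n: int, alpha: float = 0.05):
    """Exact two-sided binomial confidence interval for k successes in n trials."""
    lower = 0.0 if k == 0 else _bisect(
        lambda p: 1.0 - binom_cdf(k - 1, n, p), alpha / 2.0, increasing=True)
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p), alpha / 2.0, increasing=False)
    return lower, upper


# Assumed 2x2 counts (not given in the text), chosen to match the reported figures.
tp, fp, tn, fn = 7, 2, 8, 3
total = tp + fp + tn + fn
accuracy = (tp + tn) / total
ppv = tp / (tp + fp)  # positive predictive value
npv = tn / (tn + fn)  # negative predictive value
ci_low, ci_high = clopper_pearson(tp + tn, total)
print(f"accuracy={accuracy:.2f} PPV={ppv:.2f} NPV={npv:.2f} "
      f"95% CI=({ci_low:.2f}, {ci_high:.2f})")
```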

Kaplan-Meier analysis of the patient groups as predicted by the model resulted in
significantly different disease-free survival curves in both the training set and the independent test
set (Fig. 2, parts e and f).

Additional References

Ma, X.J. et al. Gene expression profiles of human breast cancer progression.
Proc Natl Acad Sci U S A 100, 5974-9 (2003).

Nicholson, R.I. et al. Epidermal growth factor receptor expression in breast
cancer: association with response to endocrine therapy. Breast Cancer Res
Treat 29, 117-25 (1994).
All references cited herein, including patents, patent applications, and publications, are
hereby incorporated by reference in their entireties, whether previously specifically incorporated or
not.

Having now fully described this invention, it will be appreciated by those skilled in the art
that the same can be performed within a wide range of equivalent parameters, concentrations, and
conditions without departing from the spirit and scope of the invention and without undue
experimentation.

While this invention has been described in connection with specific embodiments thereof, it
will be understood that it is capable of further modifications. This application is intended to cover
any variations, uses, or adaptations of the invention following, in general, the principles of the
invention and including such departures from the present disclosure as come within known or
customary practice within the art to which the invention pertains and as may be applied to the
essential features hereinbefore set forth.
Appendix



Sequences identified as those of IL17BR cluster 



AW675096

CCGGCGATGTCGCTCGTGCTGCTAAGCCTGGCCGCGCTGTGCAGGAGCGCCGTACCCCGA 
GAGCCGACCGTTCAATGTGGCTCTGAAACTGGGCCATCTCCAGAGTGGATGCTACAACAT 
GATCTAATCCCGGGAGACTTGAGGGACCTCCGAGTAGAACCTGTTACAACTAGTGTTGCA 
ACAGGGGACTATTCAATTTTGATGAATGTAAGCTGGGTACTCCGGGCAGATGCCAGCATC 

CGCTTGTTGAAGGCCACCAAGATTTGTGTGACGGGCAAAAGCAACTTCCAGTCCTACAGC
TGTGTGAGGTGCAATTACACAGAGGCCTTCCAGACTCAGACCAGACCCTCTGGTGGTAAA 
TGGACATTTTCCTACATCGGCTTCCCTGTAGAGCTGAACACAGTCTATTTCATTGGGGCC 
CATAATATTCCTAATGCAAATATGAATGAAGATGGCCCTTCCATGTCTGTGAATNTCACC 
TCACCAGGCTGCCTAGACCACATAATGAAATATAAAAAAAAGTGTGTCAAGGCCGGAAGC 

CTGTGGGATCCGAACATCACT

AW673932 

TTTTTTTTTTTTTTTTTTTAAAAGTGGGTTCAGCTTGTTTATTCCCTACTTTTGTTATCT 
TAAAAACAATGATTTTTTGCATGTAATAGAAGGTTTTTCACTTAAGATGCTATTGAGTGA 

ATCAGTGAGGGGTTCTTAGAGTTAGTATTCATTAATTAAACATAGAATATTAGCTAAACA
GTTCTGGGTACACTGCAATGCATGGTCTATGGAAGACTAGATGTTTGGCTGAAGATGCTT 
TATTGTTGCATTATCAAAATGGTTATAGTTTTCAATTAAAACTGTAATTGATTTCTATGT 
ATAAAACAGCTTTGAAGTTGTAAATGTAGTTTCCAATCGTTAGTTAATGCTACATTAGTT 
AGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTACTAAGGCTGTTTGTAGGC 

TGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCTGTAATTGG

BC000980 

ggcccggcga tgtcgctcgt gctgctaagc ctggccgcgc tgtgcaggag cgccgtaccc 

cgagagccga ccgttcaatg tggctctgaa actgggccat ctccagagtg gatgctacaa 

catgatctaa tcccgggaga cttgagggac ctccgagtag aacctgttac aactagtgtt

gcaacagggg actattcaat tttgatgaat gtaagctggg tactccgggc agatgccagc 

atccgcttgt tgaaggccac caagatttgt gtgacgggca aaagcaactt ccagtcctac 

agctgtgtga ggtgcaatta cacagaggcc ttccagactc agaccagacc ctctggtggt 

aaatggacat tttcctacat cggcttccct gtagagctga acacagtcta tttcattggg 

gcccataata ttcctaatgc aaatatgaat gaagatggcc cttccatgtc tgtgaatttc

acctcaccag gctgcctaga ccacataatg aaatataaaa aaaagtgtgt caaggccgga 

agcctgtggg atccgaacat cactgcttgt aagaagaatg aggagacagt agaagtgaac 

ttcacaacca ctcccctggg aaacagatac atggctctta tccaacacag cactatcatc 

gggttttctc aggtgtttga gccacaccag aagaaacaaa cgcgagcttc agtggtgatt 

ccagtgactg gggatagtga aggtgctacg gtgcagctga ctccatattt tcctacttgt
ggcagcgact gcatccgaca taaaggaaca gttgtgctct gcccacaaac aggcgtccct

ttccctctgg ataacaacaa aagcaagccg ggaggctggc tgcctctcct cctgctgtct 

ctgctggtgg ccacatgggt gctggtggca gggatctatc taatgtggag gcacgaaagg 

atcaagaaga cttccttttc taccaccaca ctactgcccc ccattaaggt tcttgtggtt 

tacccatctg aaatatgttt ccatcacaca atttgttact tcactgaatt tcttcaaaac

cattgcagaa gtgaggtcat ccttgaaaag tggcagaaaa agaaaatagc agagatgggt 

ccagtgcagt ggcttgccac tcaaaagaag gcagcagaca aagtcgtctt ccttctttcc 

aatgacgtca acagtgtgtg cgatggtacc tgtggcaaga gcgagggcag tcccagtgag 

aactctcaag acctcttccc ccttgccttt aaccttttct gcagtgatct aagaagccag 

attcatctgc acaaatacgt ggtggtctac tttagagaga ttgatacaaa agacgattac

aatgctctca gtgtctgccc caagtaccac ctcatgaagg atgccactgc tttctgtgca 

gaacttctcc atgtcaagca gcaggtgtca gcaggaaaaa gatcacaagc ctgccacgat 

ggctgctgct ccttgtagcc cacccatgag aagcaagaga ccttaaaggc ttcctatccc 

accaattaca gggaaaaaac gtgtgatgat cctgaagctt actatgcagc ctacaaacag 

ccttagtaat taaaacattt tataccaata aaattttcaa atattgctaa ctaatgtagc

attaactaac gattggaaac tacatttaca acttcaaagc tgttttatac atagaaatca 

attacagttt taattgaaaa ctataaccat tttgataatg caacaataaa gcatcttcag 

ccaaacatct agtcttccat agaccatgca ttgcagtgta cccagaactg tttagctaat 

attctatgtt taattaatga atactaactc taagaacccc tcactgattc actcaatagc 

atcttaagtg aaaaaccttc tattacatgc aaaaaatcat tgtttttaag ataacaaaag

tagggaataa acaagctgaa cccactttta aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 
aa 



BI602183 

AGCGGAGCTGCGGGTGGCCTGGATCCCGCGCAGTGGCCCGGCGATGTCGCTCGTGCTGCT
AAGCCTGGCCACGCTGTGCAGGAGCGCCGTACCCCGAGAGCCGACCGTTCAATGTGGCTC 
TGAAACTGTGGACATTTTCCTATATCGGCTTCCCTGTAGAGCTGAAAACAGTCTATTTCA 
TTGGGGCCCATAATATTCCTAATGCAAATATGAATGAAGATGGCCCTTCCATGTCTGTGA 
ATTTCACCTCACCAGGCTGCCTAGACCACATAATGAAATATAAAAAAAGTGTGTCAAGGC 

CGGAAGCCTGTGGGATCCGAACATCACTGCTTGTAAGAAGAATGAGGAGACAGTAGAAGT
GAACTTCACAACCACTCCCCTGGGAAACAGATACATGGCTCATCCAACACAGCACTATCA 
TCGGGTTTTCTCAGGTGTTTGAGCCACACCAGAAGAAACAAACGCGAGCTTCAGTGGTGA 
TTCCAGTGACTGGGGATAGTGAAGGTGCTACGGTGCAGCTGACTCCATATTTTCCTACTT 
GTGGCAGCGACTGCATCCGACATAAAGGAACAGTTGTGCTCTGCCCACAAACAGGCGTCC 

CTTTCCCCTCTGGATAACAACAAAAGCAAGCCGGGAGGCTGGCTGCCTCTCCTCCTGCTG
TCTCTGCTGGTTGGCCACATTGGGTGCTGGTGGCAGGGATCTATCTAATGTGGAGGCACG 
AAAGGATCCAGAAGACTTCCTTTTCTACCACAAACTACTGCCCCCATTAAGGTCCTGTGG 
TTACCCATCTTG7^AATATGTTCCTCACACAATTTGTTACTTCACTGAATTCTTCAAAACC 
TG 


BI458542 

AGCGGAGCGTGCGGGTGGCCTGGATCCCGCGCAGTGGCCCGGCGATGTCGCTCGTGCTGC 
TAAGCCTGGCCACGCTGTGCAGGAGCGCCGTACCCCGAGAGCCGACCGTTCAATGTGGCT
CTGAAACTGTGGACATTTTCCTATATCGGCTTCCCTGTAGAGCTGAAAACAGTCTATTTC 
ATTGGGGCCCATAATATTCCTAATGCAAATATGAATGAAGATGGCCCTTCCATGTCTGTG 
AATTTCACCTCACCAGGCTGCCTAGACCACATAATGAAATATAAAAAAAAGTGTGTCAAG 
GCCGGAAGCCTGTGGGATCCGAACATCACTGCTTGTAAGAAGAATGAGGAGACAGTAGAA
GTGAACTTCACAACCACTCCCCTGGGAAACAGATACATGGCTCATCCAACACAGCACTAT 
CATCGGGTTTTCTCAGGTGTTTGAGCCACACCAGAAGAAACAAACGCGAGCTTCAGTGGT 
GATTCCAGTGACTGGGGATAGTGAAGGTGCTACGGTGCAGCTGACTCCATATTTTCCTAC 
TTGTGGCAGCGACTGCATCCGACATAAAGGAACAGTTGTGCTCTGCCCACAAACAGGCGT 
CCCTTTCCCTCTGGATAACAACAAAAGCAAGCCGGGAGGCTGGCTGCCTCTCCTCCTGCT
GTCTCTGCTGGTGGNCACATTGGGTGCTGGTGGCAGGGATCTATCTAATGTGGAGGCACG 
AAAGGATCAGAAGACTTCCTTTTCTACCACCACATACTGCCCCCCATTAAGGTTCTTGTG 
GTTTACCC 

BI823321

GGCGATGTCGCTCGTGCTGCTAAGCCTGGCCGCGCTGTGCAGGAGCGCCGTACCCCGAGA 
GCCGACCGTTCAATGTGGCTCTGAAACTGGGCCATCTCCAGAGTGGATGCTACAACATGA 
TCTAATCCCGGGAGACTTGAGGGACCTCCGAGTAGAACCTGTTACAACTAGTGTTGCAAC 
AGGGGACTATTCAATTTTGATGAATGTAAGCTGGGTACTCCGGGCAGATGCCAGCATCCG 

CTTGTTGAAGGCCACCAAGATTTGTGTGACGGGCAAAAGCAACTTCCAGTCCTACAGCTG
TGTGAGGTGCAATTACACAGAGGCCTTCCAGACTCAGACCAGACCCTCTGGTGGTAAATG 
GACATTTTCCTATATCGGCTTCCCTGTAGAGCTGAACACAGTCTATTTCATTGGGGCCCA 
TAATATTCCTAATGCAAATATGAATGAAGATGGCCCTTCCATGTCTGTGAATTTCACCTC 
ACCAGGAAGCCTGTGGGATCCGAACATCACTGCTTGTAAGAAAGAATGAGGAGACAGTAG 

AAGTGAACTTCACAACCACTCCCCTGGGAAACAGATACATGGCTCTTATCCAACACAGCA
CTATCATCGGGTTTCTCAGGTGTTTGAGCCACACCAGAAGAAACAAACGCGAGCTTCAGT 
GGTGATTCCAGTGACTGGGGATAGTGAAGGTGCTACGGTGCAGCTGACTCCATATTTTCC 
TACTTGTGGCAGCGACTGCAATCCGACATAAAGGAACAGTTGTGCTCTGCCCACAAACAG 
GCGTCCCTTTCCCTCTTGGATAGCAACAGAAGCAAGCCGGGAGGCTGGTGCCTCTTCTTC 

TGGTGTCTCTGCTGGTGGCACATTGAGTGCTGGTGGCAGGATCCATCTAATGTGGAGGCC
CCAAAGGACCAGGAAAGACTTCCTTTATTAGCACCAAGTATTGCCC 

AA514396

TGGCTGAAGATGCTTTATTGTTGCATTATCAAAATGGTTATAGTTTTCAATTAAAACTGT 
AATTGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGTAGTTTCCAATCGTTAGTT
AATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTACTA 
AGGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCTGTAATT 
GGTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTACAAGGAGCAGCA 
GCCATCGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTGCTACTTGACATGGAGAAG 
TTCTGCACAGAAAGCAGTGGCATCCTTCATGAGGTGGTACTTGGGGCAGACACTGAGAGC
ATTGTAATCGTCTTTTGTATCAATCTCTCTAAAGTAGACCACCACGTATTTGTGCAGATG 
AATCTGGC 
BF110326

TTTGTTTGGCTGAAGATGCTTTATTGTTGCATTATCAAAATGGTTATAGTTTTCAATTAA 
AACTGTAATTGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGTAGTTTCCAATCG 
TTAGTTAATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAA
TTACTAAGGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCT 
GTAATTGGTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTTTCATGGGTGGGCTACAAGGA 
GCAGCAGCCATCGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTGCTGCTTGACATG 
GAGAAGTTCTGCACAGAAAGCAGTGGCATCCTTCATGAGGTGGTACTTGGGGCAGACACT 
GAGAGCATTGTAATCGTCTTTTGTATCAATCTCTCTAAAGTAGACCACCACGTATTTGTG
CAGATGAATCTGGCTTCTTAGATCACTGC 

BE466508 

TGGCATGAGATGCTATATTGTTGCATTATCAAAATGGGTTTAGTCTTCAATTAACACTGT 
AATTGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGTGGTTTCCAATCGTCAGTT
AATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTACTA 
AGGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCTGTAATT 
GGTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTACAAGGAGCAGCA 
GCCATCGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTGCTGCTTGACATGGAGAAG 
TTCTGCACAGAAAGCAGTGGCATCCTTCATGAGGTGGTACTTGGGGCAGACACTGAGAGC
ATTGTAATCGTCTTTTGTATCAATCTCTCTAAAGTAGACCACCACGTATTTGTGCAGATG 
AATCTGGCTTCTTAGATCACTG 

BF740045 

GTTTGGCTGAAGATGCTTTATTGTTGCATTATCAAAATGGTTATAGTTTTCAATTAAAAC
TGTAATTGATTTCTATGTATAAAACACGCTTTGAAGTTGTAAATGTAGTTTCCAATCGTT 
AGTTAATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATT 
ACTAAGGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCTGT 
AATTGGTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTACAAGGAGC 

AGCAGCCATCGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTGCTGCTTGACATGGA
GAAGTTCTGCACAGAAAGCAGTGGCATCCTTCATGAGGTGGTACTTGGGGCAGACACTGA 
GAGCATTGTAATCGTCTTTTGTATCAATCTCTCTAAAGTA 

AW299271 

TGGCTGAAGATGCTTTATTGTTGCATTATCAAAATGGTTATAGTTTTCAATTAAAACTGT
AATTGATTTCTATGTATAAAACAGCGTTGAAGTTGTAAATGTAGTTTCCAATCGTTAGTT 
AATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTACTA 
AGGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCTGTAATT 
GGTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTACAAGGAGCAGCA 

GCCATCGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTGCTGCTTGACATGGAGAAG
TTCTGCACAGAAAGCAGTGGCATCCTTCATGAGGTGGTACTTGGGGCAGACACTGAGAGC
ATTGTAATCGTCTTTTGTATCAATCTCTCTAAAGTAGACCACCACGTATTTGTGCAGATG 
AATCTGGCTTCTTAGATCACTGCAGAAAAG 

AA836217

TTTTTTTTTTACAACTTCAAAGCTGTTTTATACATAGAAATCAATTACAGTTTTAATTGA 
AAACTATAACCATTTTGATAATGCAACAATAAAGCATCTTCAGCCAAACATCTAGTCTTC 
CATAGACCATGCATTGCAGTGTACCCAGAACTGTTTAGCTAATATTCTATGTTTAATTAA 
TGAATACTAACTCTAAGAACCCCTCACTGATTCACTCAATAGCATCTTAAGTGAAAAACC 
TTCTATTACATGCAAAAAATCATTGTTTTTAAGATAACAAAAGTAGGGAATAAACAAGCT
GAACCCACTTTTACTGGACCAAATGATCTATTATATGTGTACCACTTGTATGATTTGGTA 
TTTGCATAAGACCTTCCCTCTACAAACTAGATTCATATCTTGATTCTTGTACAGGTGCCT 
TTTAACATGAACAACAAAATACCCACAAACTTGTCTACTTTTGCC 

AI203628

TAGTAATTAAAACATTTTATACCAATAAAATTTTCAAATATTGCTAACTAATGTAGCATT 
AACTAACGATTGGAAACTACATTTACAACTTCAAAGCTGTTTTATACATAGAAATCAATT 
ACAGTTTTAATTGAAAACTATAACCATTTTGATAATGCAACAATAAAGCATCTTCAGCCA 
AACATCTAGTCTTCCATAGACCATGCATTGCAGTGTACCCAGAACTGTTTAGCTAATATT 
CTATGTTTAATTAATGAATACTAACTCTAAGAACCCCTCACTGATTCACTCAATAGCATC
TTAAGTGAAAAACCTTCTATTACATGCAAAAAATCATTGTTTTTAAGATAACAAAAGTAG 
GGAATAAACAAGCTGAACCCACTTTTACTGGACCAAATGATCTATTATATGTGTAACCAC 
TTGTATGATTTGGTATTTGCATAAGACCTTCCCTCTACAAACTAGATTCATATCTTGATT 
CTTGTACAGGTGCCTTTTAACATGAA 


AI627783 

TTTTTTTTTTTTTTTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTACT 
AAGGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCTGTAAT 
TGGTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTACAAGGAGCAGC 
AGCCATCGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTGCTACTTGACATGGAGAA
GTTCTGCACAGAAAGCAGTGGCATCCTTCATGAGGTGGTACTTGGGGCAGACACTGAGAG 
CATTGTAATCGTCTTTTGTATCAATCTCTCTAAAGTAGACCACCACGTATTTGTGCAGAT 
GAATCTGGCTTCTTAGATCACTGCAGAAAAGGTTAAAGGCAAGGGGGAAGAGGTCTTGAG 
AGTTCTC 


AI744263 

TTAAAGTGGGTTCAGCTTGTTTATTCCCTACTTTTGTTATCTTAAAAACAATGATTTTTT 
GCATGTAATAGAAGGTTTTTCACTTAAGATGCTATTGAGTGAATCAGTGAGGGGTTCTTA 
GAGTTAGTATTCATTAATTAAACATAGAATATTAGCTAAACAGTTCTGGGTACACTGCAA 
TGCATGGTCTATGGAAGACTAGATGTTTGGCTGAAGATGCTTTATTGTTGCATTATCAAA
ATGGTTACAGTTTTCAATTAAAGCTGTAATTGATTTCTATGTATAAAACAGCTTTGAAGT
TGTAAATGTAGTTTCCAATCGTTAGTTAATGCTACATTAGTTAGCAATATTTGAAAATTT 
TATTGGTATAAAATGTTTTAATTACTAAGGCTGTTTGTAGGCTGCATAGTAAGCTTCAGG 
ATCATCACACGTTNTTTCCCTGTAATTGGTGGGATAGGAAGCCTTTA 


AI401622 

AGTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTACTAAGGCTGTTTGT 
AGGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCTGTAATTGGTGGGATAGG 
AAGCCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTACAAGGAGCAGCAGCCATCGTGGC 
AGGCTTGTGATCTTTTTCCTGCTGACACCTGCTACTTGACATGGAGAAGTTCTGCACAGA
AAGCAGTGGCATCCTTCATGAGGTGGTACTTGGGGCAGACACTGAGAGCATTGTAATCGT 
CTTTTGTATCAATCTCTCTAAAGTAGACCACCACGTATTTGTGCAGATGAATCTGGCTTC 
TTAGATCACTGCAGAAAAGGTTAAAGGCAAGGGGGAAGAGGTCTTGAGAGTTCTCACTGG 

AI826949

TTGGCTGAAGATGCTTTATTGTTGCATTATCAAAATGGTTATAGTTTTCAATTAAAACTG 
TAATTGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGTAGTTTCCAATCGTTAGT 
TAATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTACT 
AAGGCTGTTTGTAGGCTTGCATAGAAGCTTCAGGATCATCACACGTTTTTTCCCTGTAAT 
TGGTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTACAAGGAGCAGC
AGCCATCGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTGCTGCTTGACATGGAGAA 
GTTCTGCACAGAAAGCAGTGGCATCCTTCATGAGGTGGTACTTGGGGCAGACACTGAGAG 
CATTGTAATCGTCT 

BE047352

TTTTTTTTTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTACTAAGGCT 
GTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCTGTAATTGGTGG 
GATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTACAAGGAGCAGCAGCCAT 
CGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTGCTACTTGACATGGAGAAGTTCTG 
CACAGAAAGCAGTGGCATCCTTCATGAGGTGGTACTTGGGGCAGACACTGAGAGCATTGT
AATCGTCTTTTGTATCAATCTCTCTAAAGTAGACCACCACGTATTTGTGCAGATGAATCT 
GGCTTCTTAGATCACTGCAGAAAAGGTTAAAGGCAAGGGGGAAGAGGTCTTGAGAG 

AI911549

TTTGGCTGAAGATGCTTTATTGTTGCATTATCAAAATGGTTACAGTTTTCAATTAAAGCT
GTAATTGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGTAGTTTCCAATCGTTAG 
TTAATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTAC 
TAAGGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCTGTAA 
TTGGTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTACAAGGAGCAG 

CAGCCATCGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTGCTGCTTGACATGGAGA
AGTTCTGCACAGAAAGCAGTGGCATCCTTCATGAGGTGGTACTTGGGGCAGACA



BF194822

TTCTCTGGCTGAAGATGCTTTATTGTTGCATTATCAAAATGGTTACAGTTTTCAATTAAA 
GCTGTAATTGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGTAGTTTCCAATCGT
TAGTTAATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAAT 
TACTAAGGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCTG 
TAATTGGTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTACAAGGAG 
CAGCAGCCATCGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTGCTGCTTGACATGG 
AGAAGTTCTGCACAGAAAGCAGTGGCATCCTTCATGAGGTGGTACTTGG



AI034244 

TTTTTTTTTTTTTTTTACAACCTTGAAAGCTGTTTTATACATAGAAATCAATTACAGTTT 
TAATTGAAAACTATAACCATTTTGATAATGCAACAATAAAGCATCTTCAGCCAAACATCT 
AGTCTTCCATAGACCATGCATTGCAGTGTACCCAGAACTGTTTAGCTAATATTCTATGTT
TAATTAATGAATACTAACTCTAAGAACCCCTCACTGATTCACTCAATAGCATCTTAAGTG 
AAAAACCTTCTATTACATGCAAAAAATCATTGTTTTTAAGATAACAAAAGTAGGGAATAA 
ACAAGCTGAACCCACTTTTACTGGACCAAATGATCTATTATATGTGTAACCACTTGTATG 
ATTTGGATTTGCATAAGACCTTCCCTCTACAAACTAGATTCATATCTTGATTCT 


AI033911 

TTTTTTTTTTTTTTTTACAACTGCAAAGCTGTTTTATACATAGAAATCAATTACAGTTTT 
AATTGAAAACTATAACCATTTTGATAATGCAACAATAAAGCATCTTCAGCCAAACATCTA 
GTCTTCCATAGACCATGCATTGCAGTGTACCCAGAACTGTTTAGCTAATATTCTATGTTT 
AATTAATGAATACTAACTCTAAGAACCCCTCACTGATTCACTCAATAGCATCTTAAGTGA
AAAACCTTCTATTACATGCAAAAAATCATTGTTTTTAAGATAACAAAAGTAGGGAATAAA 
CAAGCTGAACCCACTTTTACTGGACCAAATGATCTATTATATGTGTAACCACTTGTATGA 
TTTGGTATTTGCATAAGACCTTCCCTCTACAAACTAGATTCATATCTTGATTCT 

BF064177

TTTTTTAGTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTACTAAGGCT 
GTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCTGTAATTGGTGG 
GATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTACAAGGAGCAGCAGCCAT 
CGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTGCTACTTGACATGGAGAAGTTCTG 
CACAGAAAGCAGTGGCATCCTTCATGAGGTGGTACTTGGGGCAGACACTGAGAGCATTGT
AATCGTCTTTTGTATCAATCTCTCTAAAGTAGACCACCACGTATTTGTGCAGATGAATCT 
GGCTTCTTAGATCACTGCAGAAAAGGTTAAAGGCAAGGGGGAAGAGGTCTTGAGAGTTCT 
CACTGGGACTGCCCTCGCTCTTGCCACAGGTACCATCGCACACACTGTTGACGTCATTGG 
AAAG 

AA847767

GGCTGAAGATGCTTTATTGTTGCATTATCAAAATGGTTATAGTTTTCAATTAAAACTGTA 
ATTGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGTAGTTTCCAATCGTTAGTTA 
ATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTACTAA 
GGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCTGTAATTG
GTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTACAAGGAGCAGCAG 
CCATCGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTGCTGCTTGACATGGAGAAGT 
TCTGCACAGAAAGCAGTGGCATCCTTCATGAGGTGGTA 

AI538624

TTGGCTGAAGATGCTTTATTGTTGCATTATCAAAATGGTTACAGTTTTCAATTAAAGCTG 
TAATTGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGTAGTTTCCAATCGTTAGT 
TAATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTACT 
AAGGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCTGTAAT
TGGTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTACAAGGAGCAGC
AGCCATCGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTGCTGCTTGACATGGAGAA 
GTTCTGCACAGAAAGCAGTGGCATCCTTCATGAGGTGGTAC 

AI913613 

TTGGCTGAAGATGCTTTATTGTTGCATTATCAAAATGGTTATAGTTTTCAATTAAAACTG
TAATTGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGTAGTTTCCAATCGTTAGT 
TAATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTACT 
AAGGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACACGTTNTTTCCCTGTAAT 
TGGTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTACAAGGAGCAGC 

AGCCATCGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTGCTGCTTGACATGGAGAA
GTTCTGCACAGAAAGCAGTGGCATCCTTCATG 

AI942234 

GTTTGGCTGAAGATGCTTTATTGTTGCATTATCAAAATGGTTATAGTTTTCAATTAAAAC 
TGTAATTGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGTAGTTTCCAATCGTTA
GTTAATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTA 
CTAAGGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCTGTA 
ATTGGTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTACAAGGAGCA 
GCAGCCATCGTGGCAGCTTGGTGATCTTTTTCCTGCTGACACCTGCTGCTTGACATGAAG 
AAGTTCTGCACAGAAAGCAGTGGCAT

AI580483

GTTTGGCTGAAGATGCTTTATTGTTGCATTATCAAAATGGTTATAGTTTTCAATTAAAAC 
TGTAATTGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGTAGTTTCCAATCGTTA 
GTTAATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTA
CTAAGGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCTGTA
ATTGGTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTACAAGGAGCA 
GCAGCCATCGTGGCAGGCTTGGATCTTTTTCCTGCTGACACCTGCTGCTTGACATTGGAA 
AGTTCTGCACAGAAAGCAGTGGCATC 


AI831909 

TTTTGGCTGATGATGCTTTATTGTTGCATTATCAAAATGGTTACAGTTTTCAATTAAAGC 
TGTAATTGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGTAGTTTCCAATCGTTA 
GTTAATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTA 
CTAAGGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCTGTA
ATTGGTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTACAAGGAGCA 
GCAGCCATCGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTGCTGCTTGACATGGAG 
AAGTTCTGCACAGAAAGCAGTGGCAT 

AI672344

GGCTGAAGATGCTTTATTGTTGCATTATCAAAATGGTTACAGTTTTCAATTAAAGCTGTA 
ATTGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGTAGTTTCCAATCGTTAGTTA 
ATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTACTAA 
GGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCTGTAATTG 
GTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTACAAGGAGCAGCAG
CCATCGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTGCTGCTTGACATGGAGAAGT 
TCTGCACAGAAAG 

AW025192 

GATTGGCTGTTTTATGCTTTATTGTTGCATTATCAAAATGGTTATAGTTTTCAATTAAAA
CTGTAATTGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGTAGTTTCCAATCGTT 
AGTTAATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATT 
ACTAAGGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCTGT 
TATTGGTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTACAAGGAGC 

AGCAGCCATCGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTGCTGCTTGACATGGA
GAAGTTCTGCACAAAAAGCAGTGGCATCCTTCATGAGGTGGTA 

AA677205 

GCAATATTTTAAAATTTTATTGGTATAAAATGTTTTAATTACTAAGGCTGTTTGTAGGCT 
GCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCTGTAATTGGTGGCATAGGAAGCC
TTTAAGGTCTCTTGCTTCTCATGGTGTGGGCTACAAGGAGCAGCAGCCATCGTGGCAGGC 
TTGTGATCTTTTTCCTGCTGACACCTGCTGCTTGACATGGAGAAGTTCTGCACAGAAAGC 
AGTGGCATCCTTCATGAGGTGGTACTTGGGGCAGACACTGAGAGCATTGTAATCGTCTTT 
TGTATCAATCTCTCTAAAGTAGACCACCACGTATTTGTGCAGATGAATCTGGCTTCTTAG 
ATCACTGCAGAAAAGGTTAAAGGCAAGGGGGAAGAGGTCTTGAGAGTTCTCACTGGGACT
GCCCTCGCTCTTGCCACAGGTACCATCGCACACACTG
AA721647 

TTTTTTTTTTACAACTTCAAAGCTGTTTTATACATAGAAATCAATTACAGTTTTAATTGA 
AAACTATAACCATTTTGATAATGCAACAATAAAGCATCTTCAGCCAAACATCTAGTCTTC
CATAGACCATGCATTGCAGTGTACCCAGAACTGTTTAGCTAATATTCTATGTTTAATTAA 
TGAATACTAACTCTAAGAACCCCTCACTGATTCACTCAATAGCATCTTAAGTGAAAAACC 
TTCTATTACATGCAAAAAATCATTGTTTTTAAGATAACAAAAGTAGGGAATAAACAAGCT 
GAACCCACTTTTACTGGACCAAATGATCTATTATATGTGTAACCACTTGTATGATTTGGT 
ATTTG

BF115018 

GTTTCGCTGAAGATGCTTTATTGTTGCATTATCAAAATGGTTATAGTTTTCAATTAAAAC 
TGTAATTGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGTAGTTTCCAATCGTTA 
GTTAATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTA
CTAAGGCTGTTTGTAGGCTGCATAGTAAGCTTAAGGCCCATCACACGTTTTTTCCCTGTA 
ATTGGTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTNTCATGGGTGGGCTACAAGGAGCA 
GCAGCCATCGTGGCAGGCTTGNGATCTTTTTCCTGCTGGCCCCTGCTGCTTGACAT 

W61238

NAAAGCACTGGCTGAAGGAAGCCAAGAGGATCACTGCTGCTCCTTTTTTCTAGAGGAAAT 
GTTTGTCTACGTGGTAAGATATGACCTAGCCCTTTTAGGTAAGCGAACTGGTATGTTAGT 
AACGTGTACAAAGTTTAGGTTCAGACCCCGGGAGTCTTGGGCACGTGGGTCTCGGGTCAC 
TGGTTTTGACTTTAGGGCTTTGTTACAGATGTGTGACCAAGGGGAAAATGTGCATGACAA 
CACTAGAGGTATGGGCGACACGANAACGAACGGGAAGTTTTGGCTGAAGTAGGAGTCTTG
GTGAGATTTTGCTCTGATGCATGGTGTGAACTTTCTGAGCCTCTTGTTTTTCCTCAAGCT 
GACTCCATATTTTCCTACTTGTGGCAGCGACTGCATCCGACATAAAGGAACAG 

W61239 

TAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTACTAAGGCTGTTTGTAGG
CTGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCTGTAATTGGTGGGATAGGAAG 
CCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTACAAGGAGCAGCAGCCATCGTGGCAGG 
CTTGTGATCTTTTTCCTGCTGACACCTGCTGCTTGACATGGAGAAGTTCTGCACAGAAAG 
CAGTGGCATCCTTCATGAGGTGGTACTTGGGGCAGACACTGAGAGCATTGTAATCGTCTT
TTGTATCAATCTCTCTAAAGTAGACCACCACGTATTTGTGCAGATGAATCTGGCTTCTTA
GATCACTGCAGAAAAGGTTAAAGGCAAGGGGGGA 

AI032064 

AGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTACTAAGGCTGTTTGTAGGC
TGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCTGTAATTGGTGGCATAGGAAGC
CTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTACAAGGAGCAGCAGCCATCGTGGCAGGC 
TTGTGATCTTTTTCCTGCTGACACCTGCTGCTTGACATGGAGAAGTTCTGCACAGAAAGC 
AGTGGCATCCTTCATGAGGTGGTACTTGGGGCAGACACTGAGAGCATTGTAATCGTCTTT 
TGTATCAATCTCTCTAAAGTAGACCACCACGTATTTGTGCAGATGAATCTGGCTTCTTAG
ATCACTGCAGAAAAGGTTAAAGGCAAGGGGGAAGAGGTCTTGAGAGTTCTCACTGGGACT 
GCCCTCGCTCTTGCCAC 

AW236941 

TTTTTTTTTTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTACTAAGGC
TGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCTGTAATTGGTG 
GGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTACAAGGAGCAGCAGCCA 
TCGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTGCTGCTTGACATGGAGAAGTTCT 
GCACAAAAAGCAGTGGCATCCTTCATGAGGTGGTACTTGGGGCAGACACTGAGAGCATTG
TAATCGTCTTTTGTATCAATC

BG057174

TTTTATACATAGAAATCAATTACAGCTTTAATTGAAAACTATAACCATTTTGATAATGCA 
ACAATAAAGCATCTTCAGCCAAACATCTAGTCTTCCATAGACCATGCATTGCAGTGTACC 
CAGAACTGTTTAGCTAATATTCTATGTTTAATTAATGAATACTAACTCTAAGAACCCCTC
ACTGATTCACTCAATAGCATCTTAAGTGAAAAACCTTCTATTACATGCAAAAAATCATTG 
TTTTTAAGATAACAAAAGTAGGGAATAAACAAGCTGAACCCACTTTTACTGGACCAAATG
ATCTATTATATGTG 

AW058532 

GGCTGAAGATGCTTTATTGTTGCATTATCAAAATGGTTATAGTTTTCAATTAAAACTGTA 
ATTGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGTAGTTTCCAATCGTTAGTTA
ATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTACTAA 
GGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCCTGTATGG 
GTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCT 

T98360

TNAGGAANGAGAAGAAGCGAGATNNANNTNNAGAAATANGTGGTGGCNTANTTTAGAGAG
ATTGATNCAAAAGCNGATTNCAATNNNCTCAGTGNCTNCCCAAGTNCCNCCTCATGAAGG 
ATNCACTNCTTTCTGTGCAGACTNNNCATGTCAAGCAGCAGGTGTCAGCAGGAAAAAGAN 
CACAAGCTCCNCGATGGCTGCTGCTCCTTGTAGCCCNCCATGAGAAGCAAGAGNCTTAAA 
GGCTTCCTATCCCACCAATTACAGGGAAAAACGTGTGATGACCTGAGCTTACTATGCAGC
CTACAANCAGCCTTAGTAATTAAACCNTTTATT 

T98361 

NANNATGAAGATGCTTTATTGTTGCATTATCAAAATGGTTACAGTTTTCAATTAAAGCTG
TAATTGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGTAGTTTCCAATCGTTAGT
TAATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGNATAAAATGTTTTAATTACT 
AAGGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTTNCCCTGTAAT 
TGGGTGGGGATAGGGAAGCCCTTTAAGGGTCTCTTGCTTCTCATGGGGTGGGGCCTACNA 
AGGGAGCAGCCAGCCCATCGTGGCCAGGGCCTTGTGGANCCTTTTTCCCTGCCTGGACAC 
CCTGCCTGCCTTGGACCATGGGGAGGAAGGTTCTGGCACCAGGAAAGCCAGGTGGCCCAT
CCCTTCCATGAGGGTGGGGTACTTNGGGGGGCCAGGACCACTGAGGNGCCATTGGTAATC 
CGTCCTTTTNGTATCCAATCCCCTCCTAAGGTAGGNCCCCCC 

AI470845 

TTTTGTGGGTTCAGCTTGTTTATTCCCTACTTTTGTTATCTTAAAAACAATGATTTTTTG
CATGTAATAGAAGGTTTTTCACTTAAGATGCTATTGAGTGAATCAGTGAGGGGTTCTTAG 
AGTTAGTATTCATTAATTAAACATAGAATATTAGCTAAACAGTTCTGGGTACACTGCAAT 
GCATGGTCTATGGAAGACTAGATGTTTGGCTGAAGATGCTTTTATTGTTGCATTATCAAN 
ATGGTTTATAGTTTTCAATTAAAACTGTAATTGATTT 

AI497731 

GGCTGAAGATGCTTTATTGTTGCATTATCAAAATGGTTATAGTTTTCAATTAAAACTGTA 
ATTGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGTAGTTTCCAATCGTTAGTTA 
ATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTACTAA 
GGCTGTTTGTAGGCTGCATAGTAAGCTTAANGATCATACNCACGTTTTTCCCTGAATTTG
GTGGGATAANGAAGCCTTTAAAGGT 

T96629 

TTGAAAATTTTATTGGNATAAAATGTTTTAATTACTAAGGCTGTTTGTAGGCTGCATAGT 
AAGCTTCAGGANCATCACACGTTTTTTCCCTGTAATTGGTGGCATAGGAAGCCTTTAAGG
TCTCTTGCTTCTCATGGGTGGGCTACAAGGAGCAGCAGCCATCGTGGCAGGCTTGTGATC 
TTTTTCCTGCTGACACCTGCTGCTTGACATGGAGAAGTTCTGCACAGAAAGCAGTGGCAT 
CCTTCATGAGGTGGTACTTGGGGCAGACACTGAGAGCATTGTAATCGTCTTTTGTATCAA 
TCTCTCTAAAGTAGACCACCACCGTNTTTGTGCAGATGGANTCTGGCTTC 

T96740

AGGCACTATCATCGGGTTTTCTCAGGTGTTTGAGCCACACCAGAAGAAACAAACGCGAGC 
TTCAGTGGTGATTCCAGTGACTGGGGATAGTGAAGGTGCTACGGTGCAGCTGACTCCATA 
TTTTCCTACTTGTGGCAGCGACTGCATCCGACATAAAGGAACAGTTGTGCTCTGCCCACA 
AACAGGCGTCCCTTTCCCTCTGGATAACAACAAAAGCAAGCCGGGANGGNCTGNCCTCTC
CTCCTGCTGTCTCTGCTGGTGGCCACATGGGTGCTGGTGGCAGGGATCTATCTAATGTGG 
AGGCACGAAAGGATCAAGAAGACTTCCTTTTCTAACCACCACATTACTGCCCCCCATTTA 
AGGTTCTTGTGGTTTTACCCATCTGGAAATATGTTTTCCCTTCACACATTTGTTTATTTC 
ATTGATTTNTTTCAAAACCTTGGCAGGAGTTT 

H25975 

GGGTCCAGTGCAGTGGCTTGCNTGCAGAAAGAAGGCAGCAGACAAAGTCGTCTTCCTTCT 
TTCCAATGACGTCAACAGTGTGTGCGATGGTACCTGTGGCAAGAGCGAGGGCAGTCCCAG 
TGAGAACTCTCAAGACCTCTTCCCCCTTGCCTTTAACCTTTTCTGCAGTGATCTAAGAAG 
CCAGATTCATCTGCACAAATACGTGGTGGTCTACTTTAGAGAGATTGATACAAAAGACGA
TTACAATGCTCTCAGTGTCTGCCCCAAGTACCACCTCATGAAGGATGCCACTGCTTTCTG 
TGCAGAACTTCTCCATGTCAAGCAGCAGGTGTCAGCAGGAAAAAGATTCACAAGCCTGCC 
ACGATGGCTGCTTGCTTCCTTTGTAGCCCACCCATGAGGAAGNCAAGAGACCTTNAAAGG 
GTTCCTTTTCCCATCANTTTACAGGGGANAAAACGTGTGATGATC 

H25941 

TTTTGTTTGGCTNATNTNNTTCTTATTGTTGCATTATCAAAATGGTTATAGTTTTCAATT 
AAAACTGTAATTGATTNCTATGTATAAAACAGCTTTGAAGTTGTAAATGTAGTTTCCAAT 
CGTTAGTTAATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATAAAANGTTTT 
AATTACTAAGGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTTCCC
CTGTAATTGGTGGGATAGGAAGCCTTTAAGGTCTCTNGCTTCTCATGGGTGGGCTACAAG 
GAGCAGCAGCCATCGTGGCAGGCTTGTGANCTTTTNCCTGCTGACACCTGCTGCTTGACA 
TGGGAGAAGTTCTGCACAGAAAGGCAGTGGGCATCCTTCATGAGGTGGGTACTTGGGGGN 
CAGACACTGAGGAGCATTGT 

BE539514 

ACTCAAAAGAAGGCAGCAGACAAAGTCGTCTTCCTTCTTTCCAATGACGTCAACAGTGTG 
TGCGATGGTACCTGTGGCAAGAGCGAGGGCAGTCCCAGTGAGAACTCTCAAGACCTCTTC 
CCCCTTGCCTTTAACCTTTTCTGCAGTGATCTAAGAAGCCAGATTCATCTGCACAAATAC
GTGGTGGTCTACTTTAGAGAGATTGATACAAAAGACGATTACAGTGCTCTCAGTGTCTGC
CCCAAGTACCACCTCATGAAGGATGCCACTGCTTTCTGTGCAGAACTTCTCCATGTCAAG 
CAGCAGGTGTCAGCAGGAAAAAGATCACAAGCCTGCCACGATGGCCGCTGCTCCTTGTAG 
CCCACCCATGAGAAGCAAGAGACCTTAAAGGCTTCCTATCCCACCAATTACAGGGAAAAA 
ACGTGTGATGATCCTGAAGCTTACTATGCAGCCTACAAACAGCCTTAGTAATTAAAACAT
TTTATACCAATAAAATTTTCAAATATGCTAACTAATGTAGCATTAACTAACGATTGGAAA
CTACATTTACAACTTCAAAGCTGTTTTATACATAGAAATCAATTACAGCTTTAATTGAAA
ACTGTAACCATTTTGATAATGCAACAATAAAGCATCTTCAG

BX282554

GTCCAGTGCAGTGGCTTGCCACTCAAAAGAAGGCAGCAGACAAAGTCGTCTTCCTTCTTT 
CCAATGACGTCAACAGTGTGTGCGATGGTACCTGTGGCAAGAGCGAGGGCAGTCCCAGTG
AGAACTCTCAAGACCTCTTCCCCCTTGCCTTTAACCTTTTCTGCAGTGATCTAAGAAGCC 
AGATTCATCTGCACAAATACGTGGTGGTCTACTTTAGAGAGATTGATACAAAAGACGATT 
ACAGTGCTCTCAGTGTCTGCCCCAAGTACCACCTCATGAAGGATGCCACTGCTTTCTGTG 
CAGAACTTCTCCATGTCAAGCAGCAGGTGTCAGCAGGAAAAAGATCACAAGCCTGCCACG 
ATGGCCGCTGCTCCTTGTAGCCCACCCATGAGAAGCAAGAGACCTTAAAGGCTTCCTATC
CCACCAATTACAGGGGAAAAAACGTGTGATGATCCTGAAGCTTACTAT 

R74038 

TATTGTTGCATTATCAAAATGGTTATAGTTTTCAATTAAAACTGTAATTGATTTCTATGT 
ATAAAACAGCTTTGAAGTTGTAAATGTAGTTTCCAATCGTTAGTTAATGCTACATTAGTT
AGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTACTAAGGCTGTTTGTAGGC 
TGCATAGTAAGCTTCAGGATCATCACACGTTTTTNCCCTGTAATTGGGTGGGGATAGGGA 
AGCCTTTAAGGTCTCTTGCTTCTCATGGGGTGGGGCTACAAGGGAGGCAGGCAGCCATCG 
TGGGCAGGGCTTGTGATCTTTTTCCCTGCTGACACCTGCTGCTTGACATGGGGGGAAGGT 
TCTGGCACAGAAAGCAGTGGGCATCCTTCATGAGGGTGGTACTTGGGGGGCAGACACTGA
GGAGGCNTTGTAAATCGNCTTTTTNGTATCCAANCTCTNCTAAAGTAGGGNCCACCNCGT 
TTTTTNTTGCAGGTGGATNCGGGGCTN 

R74129 

GGGTCCAGTGCAGTGGCTTGCNTNCAAAAGAAGGCAGCAGACAAAGTCGTCTTCCTTCTT
TCCAATGACGTCAACAGTGTGTGCGATGGTACCTGTGGCAAGAGCGAGGGCAGTCCCAGT 
GAGAACTCTCAAGACCTCTTCCCCCTTGCCTTTAACCTTTTCTGCAGTGATCTAAGAAGC 
CAGATTCATCTGCACAAATACGTGGTGGTCTACTTTAGAGAGATTGATACAAAAGACGAT 
TACAATGCTCTCAGTGTCTGCCCCAAGTACCACCTCATGAAGGATGCGACTGCTTTCTGT
GCAGAACTTCTCCATGTCAAGCAGCAGGTGTCAGCAGGAAAAAGATCACAAGCCTGCCAC
GATNGCTGCTGCTCCTTGTAGNCCACCCATGAGAAGCAAGTGACCTTTAAAGGNTTTCCT 
ATTNCCACCNATTTACAGGG 

BG433769 

GACTAGATGTTTGGCTGAAGATGCTTTATTGTTGCATTATCAAAATGGTTATAGTTTTCA
ATTAAAACTGTAATTGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGTAGTTTCC 
AATCGTTAGTTAATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATAAAATGT 
TTTAATTACTAAGGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTT 
TCCCTGTAATTGGTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTAC
AAGGAGCAGCAGCCATCGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTGCTGCTTG
ACATGGAGAAGTTCTGCACAGAAAGCAGTGGCATCCTTCATGAGGTGGTACTTGGGGCAG
ACACTGAGAGCATTGTAATCGTCTTTTGTATCAATCTCTCTAAAGTAGACCACCACGTAT 
TTGTGCAGATGAATCTGGCTTCTTAGATCACTGCAGAAAAGGTTAAAGGCAAGGGGGAAG 
AGGTCTTGAGAGTTCTCACTGGGACTGCCCTCGCTCTTGCCACAGGTACCATCGCACACA 
CTGTTGACGTCATTGGAAAAGAAGGAAGAC

BG530489 

GAGTTCTCACTGGGACTGCCCTCGCTCTTGCCACAGGTACCATCGCACACACTGTTGACG 
TCATTGGAAAGAAGGAAGACGACCTTGTCTGCTACCTTCTTTTGAGTGGCAAGCCACTGC 
ACTGGACCCATCTCTGCTATTTTCTTTTTCTGCCACTTTTCAAGGATGACCTCACTTCTG 
CAATGGTTTTGAAGAAATTCAGTGAAGTAACAAATTGTGTGATGGAAACATATTTCAGAT 
GGGTAAACCACAAGAACCTTAATGGGGGGCAGTAGTGTGGTGGTAGAAAAGGAAGTCTTC 
TTGATCCTTTCTGTGAGAGGAGAAAAGCATTTGTTATCTGTGAATAGCAAACAGCAGGCT 
TTCACTCTGTAAACCATCCCTGACAAATGATCCCTTGCTAGAGAATGTCAGCTGAGCACC 
AAGGGCCTTGTTAGTGACAGCAAGGAAAAACATCCTGATGTTCCTTTTGAACACATCACC 
TGAAACACACTGATGCTTAAACCTTAACTTTTTTTTTTTGGGGGACATAGTCTCACTCTG 
TCGCCCAGGCTGGAGTGCGTGGGAGAGGACCTCGGAAAGACTGGCAAGCATCCGCATACA 
AGGGAGTAACAGCACAATACTCCGTGAACTTCGGAGCCCTCCAAAGGAATACTCAAGGGC 
GGGTAAAGGATGGCAAGGGTCGACGGAGAGCCCACGAGGAGAGCGGAAGGTAGAGAGGAG 
ACAAGCATAAGACGCGAGAGGAACTCCAAGGCGGGGCCAAAGAGAGAAACCACGGTCACC 
AACAGAAG 

AA007528 

AGAAGCCAGATTCATCTGCACAAATACGTGGTGNTCTACTTTAGAGAGATTGATACAAAA 
GACGATTACAATGCTCTCAGTGTCTGCCCCAAGTACCACCTCATGAAGGATGCCACTGCT 
TTCTGTGCAGAACTTCTCCATGTCAAGCAGCAGGTGTCAGCAGGAAAAAGATCACAAGCC 
TGCCACGATGGCTGCTGCTCCTTGTAGCCCACCCATGAGAAGCAAGAGACCTTAAAGGCT 
TCCTATCCCACCAATTACAGGGNAAAAACNGTAGTGATNATCCCTGACAGCTTACTATGC 
CAGCCNT 

AA007529 

TTGGCTGAAGATGCTTTATTGTTGCATTATCAAAATCGGTTACAGTTTTCAATTAAAGCT 
GTAATTNGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGTAGTTTCCAATCGTTA 
GTTAATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATAAAATGTTTTAATTA 
CTAAGGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCTGTA
ATTGGGTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATTGGGTGGGCTACAAGGAG 
CAGCAGCCATCCGTNGGCAAGGCTTTGTGGATNCT 

BI260259 

GGAAGAGAAAGATCGTCCAGAGGTTCCATCGCACACACTGTATGACGTCATTGGAAATGA
AGGAAGACGACTTTGTCTGCTGGCTTCTTGTGAGTGGCAAGCCACTGCAGTGGACCCATC
TCTGCTATTTTCTTTATTCTGCCACTTTTCAAGGATGACCTCACTTCTGCAATGGTTTTG 
AAGAAAGTTCAGTGAAGTAACAAATTGTGTGATGGAAACATATTTCAGATGGGTAAACCA 
CAAGAACCTTAATGGGGGGCAGTAGTGTGGTGGTAGAAAAGGAAGTCTTCTTGATCCTTT 
CTGTGAGAGGAGAAAAGCATTAGTTATCTGTGAACAGCAAACAGCAGGCATTTCACATCT
GTAAACCATCCCTGACAAATGATCCCTTGCTAGAGAATGTCAGCTGAGCACCAAGGGGCC 
TTGTTAGTGACAGCAAGGACAAAACATCCTGATGTTCCTTTTGAACACATCAGCTGAAAC 
ACACTGATGCTCTAAACCGTTAACTATTTATTAATGGGGGAACATAGGTCTCAACTCATG 
TACGACCAGGCTGGAGTGCAGTGGGGTTGAACATCGACAGACATAGCAAACCACCGATCA 
CTAGGGAAACAACGCACAGAACTCCAGACTTAAAACACC

AA287951 

ATTCGGCACCTGGGGGGCAGACACTGAGAGCATTGTAATCGTCTTTTGTATCAATCTCTC 
TAAAGTAGACCACCACGTATTTGTGCAGATGAATCTGGCTTCTTAGATCACTGCAGAAAA
GGTTAAAGGCAAGGGGGAAGAGGTCTTGAGAGTTCTCACTGGGACTGCCCTCGCTCTTGC
CACAGGTACCATCGCACACACTGTTGACGTCATTGGAAAGAAGGAAGACGACTTTGTCTG 
CTGCCTTCTTTTGAGTGGCAAGCCACTGCACTGGACCCATCTCTGCTATTTTCTTTTTCT 
GCCACTTTTCAAGGATGACCTCACTTCTGCAATGGTTTTGAAGAAATTCAGTGAAGTAAC 
AAATNTGTGTGATGGAAACATATTTCAGATGGGTAAACCACAAGAACCTTAATGGGGGGC
AGTAGTGTGGTGGTAGAAAAGGAAGTCTTCTTGATCCTTTCTGTGAGAGGAGAAAGC

AA287911 

TTTTGATGGTCCACTTCCATTTAATGAATTAGTAAATATCTTTTCTCATGATTTTAATTA 
CATTTTTTTCTCTAGCTTACTTTATTATAATACAGCACATAATACACCTAACATGCAAAA
TATGTGTTAATTGGCTGTTTATGTTATTGGTAAGACTTCCAGTCAACAGTAGGCTATTAG
AAGTTAAGTTGTGGGAAAATCAAAGGTTATAGGAGATTTTCAACTGCATGCAGGGCCGGT 
GCCCTCCCCACTGTGTTGTTCAAGGGTCAGCTGTACTCTCTAAGGGCTTTGCTAACTTCA 
AAACATGGAGTATTTGAATACAGAAACCAGAGCATTTACATACTCAGCTCAAGGCAGAGC 
TATTAAAAAAACTCCTCTTCTCCATATGTAGGAAAGGAAATACAAATGCATCCTTTGAGT
CATTTGTGATGT

T97852 

AACAGTTGTGCTCTGCCCACAAACAGGCGTCCCTTTCCCTCTGGATAACAACAAAAGCAA 
GCCGGGANGNCTGNCGCTCTCCTCCTGCTGTCTCTGCTGGTGGCCACATGGGTGCTGGTN 
GCAGGGATCTATCTAATGTNGAGGCACGAAAGGGATCAAGAGGACTTCCTTTTCTACCAC
CACACTACTGCCCCCCATTAAGGTTCTTGTNGGTTTACCCATCTGGAAATATGTTTCCAT 
CACACAATTTGTTACTTCACTGGAATTTCTTCAAAACCATTGGCAGGANGTGAGGGTCAT 
CCTTGGAAAAGTGGGC 

T97745

CCTCACTTCTGCAATGGTTTTGAAGAAATTCAGTGAAGTAACAAATTGTGTGATGGAAAC
ATATTTCAGATGGGTAAACCACAAGAACCTTAATGGGGGGCAGTAGTGTGGTGGTAGAAA 
AGGAAGTCTTCTTGATCCTTTCGTGCCTCCACATTAGATAGATCCCTGCCACCAGCACCC 
ATGTGGCCACCAGCAGAGACAGCAGGAGGAGAGGCAGCCAGCCTCCCGGCTTTGCTTTTG 
TTGTTATCCAGAGGGGAAAGGGGACGCCTGTTTNTGGGGCAGAGCACAACTGTTTCCCTC
GTGCCCGAATTCTTTGGGCCTTCGAGGGGCCAAATTTCCCTATTAGGTGAGGTCGTATTT 
TAAATTTCGGTAATTCATGGTCATAGGCTTGTTTTTCCCCG 

N40294 

GTTTCAACACAATTTTGGATCAGCTGCCTGTTTGCAAAAACATAATATATTTCTGTTAAA 
CAGTTCTTCACCTAACAGCATATTGCTCTTATAACTGGTAGAGCTGTTTCAAAGGAAGTT 
GGTTTCTGGTCCAAGTTTTGACCTAAACCATGTCCATCTTCTATTACCAGCACTTACAAG 
CACTGTGAAAACTGATCATGACAAATAAGTAAAATTTGCTACATTAAACATATTGCCTCA 
GCCATTACTAAGCGTCCACTTGTAAAGCTGGACACAGTTTTTACTTTATGCTTCATTTTG 
ATTTTTTATCCGTAAGACATAAATTAGAAGGCATGAGGTGGCCCTTTAAGGATAATCTGC 
AAATATACACATTTTAAATAGTCATCCATCTGGAAATCGNTCCACCATTCCAGGGGAAGG 
ATTCCAGGTATTGGTGCTGTGGTGGAAATAAAGCATTCCCCNGGGAAAAAAACCATTTTA 
TGNCTAAATAATTACCACCATTAACCTCNTGGGGTT 

AA809841

GAATACTAACTCTAAGAACCCCTCACTGATTCACTCAATAGCATCTTAAGTGAAAAACCT 
TCTATTACATGCAAAAAATCATTGTTTTTAAGATAACAAAAGTAGGGAATAAACAAGCTG 
AACCCACTTTTACTGGACCAAATGATCTATTATATGTGTAACCACTTGTATGATTTGGGA 
TTTGCAT 

AA832389 

TTTTTTACAACTTCAAAGCTGTTTTATACATAGAAATCAATTACAGTTTTAATTGAAAAC 
TATAACCATTTTGATAATGCAACAATAAAGCATCTTCAGCCAAACATCTAGTCTTCCATA 
GACCATGCATTGCAGTGTACCCAGAACTGTTTAGCT 

H14692

CTGAGTGTGATGGTGTAAGCCTGTGGTCCCAGCTACTAGGGAGGCTGAGATGGGATTACA 
GGTGTGAGCCACGGCGCCTGGCCTAAAAGCATCTTTTTCTTTAACGCAGAGGTTATGTTG 
TATTATTAGCATAAATGTTTTTTTCTGGGAATGCTTATTTCACACAGCACAATACTGAAT
CTTCTCTGGAATGTGGATCGATTTCAGATGGATGACTATTAAAATGTGTATATTTGCAGA
TTATCCTTAAAGGGCCACCTCATGCCTTCTAATTTATGTCTTACGGATAAAAAATCAAAA
TGAAGCATAAAGTAAAAACTGTGTCCAGCTTTACAAGTGGACGCTTAGTAATGGCTGAGG 
CAATATGTTTAATGTAGCCAAATTTTACTTATTTGTCCATGATCCAGTTTTTCACAGTGC 
TTGTTAAGTGCTGGTAATTAGGAAGGTGGGACATGGGTTAGGTCAAAACTTGGGACCNGA
AACCAACTTGN

AA732635

TTTTTTTTTTACAACTTCAAAGCTGTTTTATACATAGAAATCAATTACAGTTTTAATTGA 
AAACTATAACCATTTTGATAATGCAACAATAAAGCATCTTCAGCCAAACATCTAGTCTTC 
CATAGACCATGCATTGCATTGTACCCAGAACTGTTTAGCTAATATTCTATGTTTAATTAA
TGAATACTAACTCTAAGAACCCCTCACTGATTCACTCAATAGCATCTTAAGTGAAAAACC 
TTCTATTACATGCAAAAAATCATTGGTTTT 

AA928257 

TTTTCTGAGTAAGAACAGGCTTTATTTGTAAAACCACTCGTGACTCTTTACAAAGCAGGA
TACACAGAAGGGAAAAAAATACACAGTGCAAAATGGATGTTCTGAGTGCCACAAGGATCT 
GCTGAAAAAAGCCAAAGATGTAAGATGGCTGGGTATATATGAGAATGAATATTTCACTAT 
ATTCTGATTCAATTACCAGTCTCAGTGGCCCAGGATGAGCTTTTGGTGTGGTCACATGGC 
CAACATTTGGATAACAAATGAGGAATAATGGTACCGCCTCACTAGTGCCTGAGAACAGCA
TGTTCTGGAAAATGTCTCTGGAGTTAGAGATGTGTTAGCTTTTTCATTACAGATGGAGAA
ATACAATGTTTACACAACAGTCCAGGGGTGGGGTCAAAAGTTGGAAGGTGTCATTAGACG 
CAGCCAAATAAAGTGAAGACAACCCAGGTGACTGGCAGCCCTGACTTGTGCGTGGGCG 

AM 84427 

TTTCTGAGTAAGAACAGGCTTTATTTGTAAAACCACTCGTGACTCTTTACAAAGCAGGAT
ACACAGAAGGGAAAAAAATACACAGTGCAAAATGGATGTTCTGAGTGCCACAAGGATCTG 
CTGAAAAAAGCCAAAGATGTAAGATGGCTGGGTATATATGAGAATGAATATTTCACTATA 
TTCTGATTCAATTACCAGTCTCAGTGGCCCAGGATGAGCTTTGGTGGTGGTCACATGGCC 
AACATTTGGATAACAAATGAGGA 

AI298577 

GAGATGGAGGTCTCGCTTTGTGACGTAGCCTGGTCTTGAGCGATCCTTTTGCCTTGGCCT 
TGCCAAAGTGCTGGGATTGGAGGCATGAGCCACTGCACCCACCCCTGTTTTTTTTTTAAG 
TAAACCATTATAATAACTCATTTATAAAAAGGTTACTTCAAGAGGGCTTTCAACTTAAGA 
ATTATTTTCATTTTGAACATGAAAAGTTAAATAGTAACTAAGAAACTGAGAACTCTGACA
GTGACCTCTAATAGGTAACTTTAGGCAAAAGTAGACAAGTTTGTGGGTATTTTGTTGTTC 
ATGTTAAAAGGCACCTGTACAAGAATCAAGATATGAATCTAGTTTGTAGAGGGAAGGTCT 
TATGCAAATACCAAATCATACAAGTGGT

AI692717

AGAGATGTTGGTCTCGCTTTGTGACGTAGCCTGGGCTTGAGCGATCCTTTTGCCTTGGCC 
TTGCCAAAGTGCTGGGATTGGAGGCATGAGCCACTGCACCCACCCCTGTTTTTTTTTTAA 
GTAAACCATTATAATAACTCATTTATAAAAAGGTTACTTCAAGAGGGCTTTCAACTTAAG 
AATTATTTTCATTTTGAACATGAAAAGTTAAATAGTAACTAAGAAACTGAGAACTCTGAC
AGTGACCTCTAATAGGTAACTTTAGGCAAAAGTAGACAAGTTTGTGGGTATTTTGTTGTT
CATGTTAAAAGGCACCTGTACAAGAATCAAGATATGAATCTAGTTTGTAGAGGGAAGGTC 
TTATGCAAATACCAAATCATACAAGTGGTTACACATATAATAGATCATTTGGTCCAGTAA 
AAGTGGGTTCAGCTTGTTTATTCCCTACTT 

AA910922

GAGATGGAGGTCTCGCTTTGTGACGTAGCCTGGTCTTGAGCGATCCTTTTGCCTTGGCTT 
GCAAAGTGCTGGGATTGGAGGCATGAGCACTGCACCCACCCCTGTTTTTTTTTTTAAGTA 
AACCATTATAATAACTCATTTATAAAAAGGTTACTTCAAGAG 

H90761 

TTCACTCAATAGCATCTTAAGTGAAAAACCTTCTATTACATGCAAAAAATCATTGTTTTT 
AAGATAACAAAAGTAGGGAATAAACAAGCTGAACCCACTTTTACTGGACCAAATGANCTA 
TTATATGTATAACCACTTGTATGATTTGGTATTTGCATAAGACCTTCCCTCTACAAACTA 
GATTCATATCTTGATTCTTGTACAGGTGCCTTTTTAATATTCTGTGATGAAATCGTTCAC
AGTCAGAGTACATGTCTGCTGCATATGGGAAATAGGGACTGTTGTTCTGAGGGACAAGGC 
ACTCAATTCAGCCGTAAAGGCTGACCCGGGCTACTTTTTTTCCANGGGAATACAATTTTT 
TTACCTTGGAATAAAATNGGGCCCGACNGGAC 

AI620122

TTTTTTTTTTTGAGTAAGAACAGGCTTTATTTGTAAAACCACTCGTGACTCTTTACAAAG 
CAGGATACACAGAAGGGAAAAAAATACACAGTGCAAAATGGATGTTCTGAGTGCCACAAG 
GATCTGCTGAAAAAAAGCCAAAGATGTAAGATGGCTGGGTATATATGAGAATGAATATTT 
CACTATATTCTGATTCAATTACCAGTCTCAGTGGCCCAGGATGAGCTTTTGGTGTGGTCA 
CATGGCCAACATTTGGATAACAAATGAGGAATAATGGTACCGCCTCACTAGTGCCTGAGA
ACAGCATGTTCTGGAAAATGTCTCTGGAGTTAGAGATGTGTTAGCTTTTTCATTACAGAT 
GGAGAAATACAATGTTTACACAACAGTCCAGGGGTGGGGTCAAAAGTTGGAAGGTGTCAT 
TAGACGCA 

AI793318

AAATTTTTAACTTTTAATAGTTAAAATAGTTAACTATTGGTATGGTAGGAAATGATAAAG 
TAGACTAGTATCTGTATACATTTTCTGCATTTATGACATACCTTTTTCTTCATTTTTTTC 
AATATTTTAATTGAAAAGTTCATCCGAGTTTCATCTAAGTTTTTTCAAAGTGATACAAAT 
CTCCAAAAAATTTTCCAATATATGTATTGAAAAAATCCAGGTGTAAGTGGCTCTGCGCAG 
TCCAAACCTGTGTTGTTCAAGGGTCAACTGTGTATGAATCCAAGCGAAAGCTTTTCTTAA
CACCTCATAAGAACTATTTTTTAAAAAACAGGAACTAGCATAGAGTAACCATCACAGGTA 
AAGTGTAATTTGTTATCAGCCATCTTTTGCCCATTTCAGTACTGGTAGAAGGCTCAATGG 
TAAAAATAAA 

AA962325

TTTTTTTTTTTTTTTTTTTTTTTTCTGACTGTCCCGTTTTTATTTTTACCATTGAGCCTT
CTACCAGTACTGAAATGGGCAAAAGATGGCTGATAACAAATTACACTTTACCTGTGATGG 
TTACTCTATGCTAGTTCCTGTTTTTTAAAAAATAGTTCTTATGAGGTGTTAAGAAAAGCT 
TTCGCTTGGATTCATACACAGTTGACCCTTGAACAACACAGGTTTGGACTGCGCAGACCA 
CTTACACCTGGATTTTTTCAATACATATATTGGAAAATTTTTTGGGGATTTGTATCACTT
TGAAAAAACTTAGATGAAACTCGGATGGACTTTTCCATTAAAATATTGGAAAAAATGAAG 
AAAAAGGT 

AI733290 

TTTTTTTTTTTTTTTTTTTTTTTTCTGACTGGCCCGTTTTTATTTTTACCATTGAGCCTT 
CTACCAGTACTGAAATGGGCAAAAGATGGCTGATAACAAATTACACTTTACCTGGGATGG
TTACTCTATGCTAGTTCCTGTTTTTTAAAAAATAGTTCTTATGAGGGGTTAAAAAAAGCT 
TTCGCTTGGATTCATACACAGTTGACCCTTGAACAACACAGGTTTGGACTGCGCAGAGCC 
ACTTACACCTGGATTTTTTCAATACATATATTGGAAAATTTTTTGGAGATTTGTATCACT 
TTGAAAAAACTTAGATGAAACTCGGATGAACTTTTCAATTAAAATATTGAAAAAAATGAA 
GAAAAAGGTATGTCATAAATGCAGAAAATGTATACAGATACTAGTCTACTTTATCATTTC 
CTAGCATACCAATAG

BQ226353 

TAAAGGAACAGTTGTGCTCTGCCCACAAACAGGCGTCCCTTTCCCTCTGGATAACAGTAA 
GTGCCCAGTAACTTCAACCAGATGATCAAAGTGGCTCACACACAGTCACTGCCCCCCACT 
CAGTATGTGGAAGGGTTGTGTGTATGTGGGCAGTGCAAGGGGTCGCTGCCTGTGTACACT 
GAACTGGGGTGCAGAGAAAGCCAACAGTGCTGTCCCAGAGAACCTAGAATCTGAGTAAGA 
ACAGGCTTTATTTGTAAAACCACTCGTGACTCTTTACAAAGCAGGATACACAGAAGGGAA 
AAAAATACACAGTGCAAAATGGATGTTCTGAGTGCCACAAGGATCTGCTGAAAAAAGCCA 
AAGATGTAAGATGGCTGGGTATATATGAGAATGAATATTTCACTATATTCTGATTCAATT 
ACCAGTCTCAGTGGCCCAGGATGAGCTTTTGGTGTGGTCACATGGCCAACATTTGGATAA 
CAAATGAGGAATAATGGTACCGCCTCACTAGTGCCTGAGAACAGCATGTTCTGGAAAATG 
TCTCTGGAGTTAGAGATGTGTTAGCTTTTTCATTACAGATGGAGAAATACAATGTTTACA 
CAACAGTCCAGGGGTGGGGGTCAAAAGTTGGAAGGTGTCATTAGACGCAGCCAAATAAAG 
TGAAGACCACCCAGGTGACTGGCAGCCCTGACTTGTGCGTGGGCGAAACCTTACAGATTC 
CTGGGGCACTCTGTGCCTGAACTTACCTGGATGGTCTTTGTGAGGCGGGTGGGCACTTAT 
CCTCCATNAATGGTCAGTCTAACAAGACCGGCCTGTAAAAATGGCATCTAATAGGGGCTA 
TGGAATGGAAAACAGTTGGTACCCAGAAATAACTTTAATT 

W04890 

GACAGTCTGGGAGCCCAGAGCTCTGGGAGGAGTNGGGAAAATGCTGCTTCCTGCTGCTTG 
CTTCTAGGCACCTGCTTCCGCCATCTCACTTACCATGGCTAGAGATGGGGGTGAGACTGG 
GGAAGGACAAAAGCAGGGAACAGATAAGGGATGGAAATCAGAAGGGAATATAGAAAGAAC 
TCTGGATATGCNAGAAATGCCGGTACCTGAGCATTTTGTATCAATGGGAGTACCCTCTGT
AACTGCTCAGTAGGTTACAAATGAAGAGTCCACCAGTATTAGAAACAATTTAAACTTGCC
AGTACCAACTGGGATGTGTGCCTTCAATTTGAAAATTGTATGTTTTATTTTTTAAATTTG
GTTAACAGCATTAATTTATAGAGTATTTGATGTCATTTATGGTTCCCGAGGTGTTTCCAA 
CACAATTTTTGGGATCA 

BM455231

CTTTTAATAGTTAAAATAGTTAACTATTGGTATGGTAGGAAATGATAAAGTAGACTAGTA 
TCTGTATACATTTTCTGCATTTATGACATACCTTTTTCTTCATTTTTTTCAATATTTTAA 
TTGAAAAGTTCATCCGAGTTTCATCTAAGTTTTTTCAAAGTGATACAAATCTCCAAAAAA 
TTTTCCAATATATGTATTGAAAAAATCCAGGTGTAAGTGGCTCTGCGCAGTCCAAACCTG 
TGTTGTTCAAGGGTCAACTGTGTATGAATCCAAGCGAAAGCTTTTCTTAACACCTCATAA 
GAACTATTTTTTAAAAAACAGGAACTAGCATAGAGTAACCATCACAGGTAAAGTGTAATT 
TGTTATCAGCCATCTTTTGCCCATTTCAGTACTGGTAGAAGGCTCAATGGTAAAAATAAA 
AACGGGACAGTCAGAAGATCTGGAAGTCCTGACCCTGCTTTCACCTGGCATGTGTAATCC 
AGTCATGCTCGTATCAGTCTCTGTAGGAGCACTTGAAGGTATTACATAAATGCTATCTAA 
CTCTGGGAAACGCCAACATGTGATTGCCTCCAGAGGAATCTTCTTTAAAAAAAAATTCAA 
AATGTTATTTCCTTACTAGGATGTCTTTAAAGAATTATAACCCTTACCGTGCCTCCACAT 
TAGATAGATCCCTGCCACCAGCACCCATGTGGCCACCAGCAGAGACAGCAGGAGGAGAGG 
CAGCCAGCCTCCCGGCTTGCTTTTGTCTGGAAAAAAACAAAGCTTATTCACCTTTGGAAA 
AAAATCCACACTTATCTCTTAATTTAAAAACTAAGACTTGGTATACTTTATAGAGGGTTA 
TTTATTTTTTATTATTTTTTAGTTTTGAGACAGAGTCTCGCTTTGTTGCCTANGCTGGAG 
TGCAGTGGCGCAATCTCGGTTCACTGCAGCCTCCGTTCTCCGGGGTTCAAGGCATGCTGG 
CTCAGCCTCCTGTATAGCTGGGGATTAAAGGCATGTGTTCACGCGGCCCAGCCCCTTTTG 
TAAAAGATTTAGATCCCTTTTAAAACCATCAGTCAGGAGGCTCCTTTAAAAAGTCTGGCC 
ATCTAATCTTTTTTCCCCCAAAAGGGG 

BI492426 

TTTTTTTTTTTCTTTTTTCTGAGTAAGAACAGGCTTTATTTGTAAAACCACTCGTGACTC 
TTTACAAAGCAGGATACACAGAAGGGAAAAAAATACACAGTGCAAAATGGATGTTCTGAG 
TGCCACAAGGATCTGCTGAAAAAAGCCAAAGATGTAAGATGGCTGGGTATATATGAGAAT 
GAATATTTCACTATATTCTGATTCAATTACCAGTCTCAGTGGCCCAGGATGAGCTTTTGG
TGTGGTCACATGGCCAACATTTGGATAACAAATGAGGAATAATCTCGTGC 

BG674622 

AATTTATAGAGTATTGATGTCATTTATGTTTCTGAGGTGTTTCAACACAATTTTGGATCA 
GCTGCCTGTTTGCAAAAACATAATATATTTCTGTTAAACAGTTCTTCACCTAACAGCATA
TTGCTCTTATAACTGGTAGAGCTGTTTCAAAGGAAGTTGGTTTCTGGTCCAAGTTTTGAC 
CTAAACCATGTCCATCTTCTATTACCAGCACTTACAAGCACTGTGAAAACTGATCATGAC 
AAATAAGTAAAATTTGCTACATTAAACATATTGCCTCAGCCATTACTAAGCGTCCACTTG 
TAAAGCTGGACACAGTTTTTACTTTATGCTTCATTTTGATTTTTTATCCGTAAGACATAA 
ATTAGAAGGCATGAGGTGGCCCTTTAAGGATAATCTGCAAATATACACATTTTAATAGTC
ATCCATCTGAAATCGATCCACATTCCAGAGAAGATTCAGTATTGTGCTGTGTGAAATAAG
CATTCCCAGAAAAAAAACATTTATGCTAATAATACAACATAACCTCTGCATTAAAGAAAA
AGATGCTTTTAGGCCAGGCGCCGTGGCTCACGCCTGTAATCCCTGCACTTTGAGAGGCTG 
AGGTGGGTGGATCATGAGGTCAGGAGATCAAGACCATCCTGGCTAACAGGGTGAAACCCC 
GTCTCTACTGGGGATATAACAAAGTTAGCTGGGTGTGGTGGTGGGTGCTTGTGGTCCCAG 
CTACTCAGGAGGCTGAGGCAGGAGAATGGCGTGAACCCGGAAGGCAGAGGTTGTAGTGAC
GCGAGGTTCACGCCACTGCATTCCAGTCTGGG 

BX111256

CAGGAAGNTAAGAACAGTCCTAAAATCTCTTTGGCTTCTTTGTCCTGATATGCACCGGCA 
TTTTCACAGTAGGAACTAGGGTTTCTGTCCAGTTTTTTTGGTTCTTTAAGGAATTAATGT 
TATTCTGGGTACAACTGCTTACATACATAGCACATATAGATGACATTTTTACAGGCCGTC 
TTGTTAGACTGACATACATGGAGGATAGTGCCACCCGCCTCACAAGAACATCAGGTAAGC 
TCAGGCACAGAGTGCCCAGGAATCTGTAAGGCTTCGCCCACGCACAAGTCAGGGCTGCCA 
GTCACCTGGGTTGTCTTCACTTTATTTGGCTGCGTCTAATGACACCTTCCAACTTTTGAC 
CCCACCCCTGGACTGTTGTGTAAACATTGTATTTCTCCATCTGTAATGAAAAAGCTAACA 
CATCTCTAACTCCAGAGACATTTTCCAGAACATGCTGTTCTCAGGCACTAGTGAGGCGGT 
ACCATTATTCCTCATTTGTTATCCAAATGTTGGCCATGTGACCACACCAAAAGCTCATCC 
TGGGCCACTGAGACTGGTAATTGAATCAGAATATAGTGAAATATTCATTCTCATATATAC 
CCAGCCATCTTACATCTTTGGCTTTTTTCAGCAGATCCTTGTGGCACTCAGAACATCCAT 
TTTGCACTGTGTATTTTTT 

BX117618 

AAATTTTTAACTTTTAATAGTTAAAATAGTTAACTATTGGTATGGTAGGAAATGATAAAG 
TAGACTAGTATCTGTATACATTTTCTGCATTTATGACATACCTTTTTCTTCATTTTTTTC 
AATATTTTAATTGAAAAGTTCATCCGAGTTTCATCTAAGTTTTTTCAAAGTGATACAAAT 
CTCCAAAAAATTTTCCAATATATGTATTGAAAAAATCCAGGTGTAAGTGGCTCTGCGCAG 
TCCAAACCTGTGTTGTTCAAGGGTCAACTGTGTATGAATCCAAGCGAAAGCTTTTCTTAA 
CACCTCATAAGAACTATTTTTTAAAAAACAGGAACTAGCATAGAGTAACCATCACAGGTA 
AAGTGTAATTTGTTATCAGCCATCTTTTGCCCATTTCAGTACTGGTAGAAGGCTCAATGG 
TAAAAATAAAAACGGGACAGTCAGAAAAA 

AA682806 

TCTGAGTAAGAACAGGCTTTATTTGTAAAACCACTCGTGACTCTTTACAAAGCAGGATAC 
ACAGAAGGGAAAAAAATACACAGTGCAAAATGGATGTTCTGAGTGCCACAAGGATCTGCT 
GAAAAAAGCCAAAGATGTAAGATGGCTGGGTATATATGAGAATGAATATTTCACTATATT
CTGATTCAATTACCAGTCTCAGTGGCCCAGGATGAGCTTTTGGTGTGGTCACATGGCCAA 
CATTTGGATAACAAATGAGGAATAATGGTACCGCCTCACTAGTGCCTGAGAACAGCATGT 
TCTGGAAAATGTCTCTGGAGTTAGAGATGTGTTAGCTTTTTCATTACAGATGGAGAAATA 
CAATGTTTACACAACAGTCCAGGGGTGGGGTCAAAG 

AI202376

CTGACTGTCCCGTTTTTATTTTTACCATTGAGCCTTCTACCAGTACTGAAATGGGCAAAA
GATGGCTGATAACAAATTACACTTTACCTGTGATGGTTACTCTATGCTAGTTCCTGTTTT 
TTAAAAAATAGTTCTTATGAGGTGTTAAGAAAAGCTTTCGCTTGGATTCATACACAGTTG 
ACCCTTGAACAACACAGGTTTGGACTGCGCAGAGCCACCCTCGTGCCGAATT 

AI658949 

CTGACTGTCCCGTTTTTATTTTTACCATTGAGCCTTCTACCAGTACTGAAATGGGCAAAA 
GATGGCTGATAACAAATTACACTTTACCTGTGATGGTTACTCTATGCTAGTTCCTGTTTT 
TTAAAAAATAGTTCTTATGAGGTGTTAAGAAAAGCTTTCGCTTGGATTCATACACAGTTG 
ACCCT

BG403405 

GGAAATGATAAAGTAGACTAGTATCTGTATACATTTTCTGCATTTATGACATACCTTTTT 
CTTCATTTTTTTCAATATTTTAATTGAAAAGTTCATCCGAGTTTCATCTAAGTTTTTTCA
AAGTGATACAAATCTCCAAAAAATTTTCCAATATATGTATTGAAAAAATCCAGGTGTAAG
TGGCTCTGCGCAGTCCAAACCTGTGTTGTTCAAGGGTCAACTGTGTATGAATCCAAGCGA 
AAGCTTTTCTTAACACCTCATAAGAACTATTTTTTAAAAAACAGGAACTAGCATAGAGTA 
ACCATCACAGGTAAAGTGTAATTTGTTATCAGCCATCTTTGCCCATTTCAGTACTGGTAG 
AAGGCTCAATGGTAAAAATAAAAACGGGACAGTCAGAAGATCTGGAAGTCCTGACCCTGC
TTTCACCTGGCATGTGTAATCCAGTCATGCTCGTATCAGTCTCTGTAGGAGCACTTGAAG
GTATTACATAAATGCTATCTAACTCTGGGAAACGCCAACATGTGATTGCCTCCAGAGGAA 
TCTTCTTTAAAAAAAAATTCAAAATGTTATTTCCTTACTAGGATGTCTTTAAAGAATTAT 
AACCCTTACCGTGCCTCCACATTAGATAGATCCCTGCAACAGACCCATGTGGCACCAGCA 
GAGACAGCAGGAGGAGAGGCAGCAGCTCCCGGTTGTTTGTCTGGAAAAACAAAGGTTATC
ACTTTG

BE673417 

CTGACTGTCCCGTTTTTATTTTTACCATTGAGCCTTCTACCAGTACTGAAATGGGCAAAA 
GATGGCTGATAACAAATTACACTTTACCTGTGATGGTTACTCTATGCTAGTTCCTGTTTT
TTAAAAAATAGTTCTTATGAGGTGTTAAGAAAAGCTTTCGCTTGGATTCATACACAGTTG
ACCCT 

AW021469 

GCACGAGATTATTCCTCATTTGTTATCCAAATGTTGGCCATGTGACCACACCAAAAGCTC 
ATCCTGGGCCACTGAGACTGGTAATTGAATCAGAATATAGTGAAATATTCATTCTCATAT
ATACCCAGCCATCTTACATCTTTGGCTTTTTTCAGCAGATCCTTGTGGCACTCAGAACAT 
CCATTTTGCACTGTGTATTTTTTTCCCTTCTGTGTATCCTGCTTTGTAAAGAGTCACGAG 
TGGTTTTACAAATAAAGCCTGTTCTTACTCAGAAAAAAAAAAAAAAAAAAA 

CF455736

NNTTGAACAGGCGTGACGGTCCGGATTCCCGGGATGTTGTGCTCTGCCCACAAACAGGCG
TCCCTTTCCCTCTGGATAACAACAAAAGCAAGCCGGGAGGCTGGCTGCCTCTCCTCCTGC 
TGTCTCTGCTGGTGGCCACATGGGTGCTGGTGGCAGGGATCTATCTAATGTGGAGGCACG 
AAAGGATCAAGAAGACTTCCTTTTCTACCACCACACTACTGCCCCCCATTAAGGTTCTTG 
TGGTTTACCCATCTGAAATATGTTTCCATCACACAATTTGTTACTTCACTGAATTTCTTC
AAAACCATTGCAGAAGTGAGGTCATCCTTGAAAAGTGGCAGAAAAAGAAAATAGCAGAGA 
TGGGTCCAGTGCAGTGGCTTGCCACTCAAAAGAAGGCAGCAGACAAAGTCGTCTTCCTTC 
TTTCCAATGACGTCAACAGTGTGTGCGATGGTACCTGTGGCAAGAGCGAGGGCAGTCCCA 
GTGAGAACTCTCAAGACCTCTTCCCCCTTGCCTTTAACCTTTTCTGCAGTGATCTAAGAA 
GCCAGATTCATCTGCACAAATACGTGGTGGTCTACTTTAGAGAGATTGATACAAAAGACG
ATTACAATGCTCTCAGTGTCTGCCCCAAGTACCACCTCATGAAGGATGCCACTGCTTTCT 
GTGCAGAACTTCTCCATGTCAAGCAGCAGGTGTCAGCAGGAAAAAGATCACAAGCCTGCC 
ACGATGGCTGCTGCTCCTTGTAGCCCACCCATGAGAAGCAAGAGACCTTNAAGGCTTCCT 
ATCCCACCATTACAG 

AW339874 
TTTTTTTTTXTTXCTGAGX ^ G ^ CAGGOT 

AAAGCAGGATACACAGAAGGGAAAAAAATACACAGGGCAAAATGGATGTTCTGAGTGCCA 
CAAGGATCTGCTGAAAAAAGCCAAAGATGTAAGATGGCTGGGTATATATGAGAATGAATA 
TTTCACTATATTCTGATTCAATTACCAGTCTCAGTGGCCCAGGATGAGCTTTTGGTGTGG
TCACATGGCCAACATTTGGATAACAAATGAGGAATAATGGTACCGCCTCACTAGTGCCTG 
AGAACAGCATGTTCTGGAAAATGTCTCTGGAGTTAGAGATGTGTTAGCTTTTTCATTACA 
GATGGAGAAATACAATGTTTACACAAC 

BG399724

CATGATGTTCAGTATGATCAGTTAACCTTAACCTCTGAGCATCCTGAAGCAAAATCTAAA 
TAATGCAGCTATTACCACTGGTGGTCCAGGCTCTGGTGAAGCCCTCTGAGCCCAGGAGGA 
AGAGAAAGCATTGTCCAGAGGTAGGAACACAGTCTGGGAGCCCAGAGCTCTGGGAGGAGT 
GGGAAAATGCTGCTTCCTGCTGCTTGCTTCTAGGCACCTGCTTCCGCCATCTCACTTACC
ATGGCTAGAGATGGGGGTGAGACTGGGGAAGGACAAAAGCAGGGAACAGATAAGGGATGG
AAATCAGAAGGGAATATAGAAAGAACTCTGGATGTGGAGAAATGCCGGTACCTGAGCATT 
TTGTATCAATGGGAGTACCCTCTGTAACTGCTCAGTAGGTTACAAATGAAGAGTCCACCA 
GTATTAGAAACAATTTAAACTTGCCAGTACCAACTGGGATGTGTGCCTTCAATTTGAAAA 
TTGTATGTTTTATTTTTTAAATTTGTTAACAGCATTAATTTATAGAGTATTGATGTCATT
TATGTTTCTGAGGTGTTTCAA

BF475787 

TCTGAGTAAGAACAGGCTTTATTTGTAAAACCACTCGTGACTCTTTACAAAGCAGGATAC 
ACAGAAGGGAAAAAAATACACAGTGCAAAATGGATGTTCTGAGTGCCACAAGGATCTGCT 
GAAAAAAGCCAAAGATGTAAGATGGCTGGGTATATATGAGAATGAATATTTCACTATATT
CTGATTCAATTACCAGTCTCAGTGGCCCAGGATGAGCTTTTGGTGTGGTCACATGGCCAA
CATTTGGATAACAAATGAGGAATAATGGTACCGCCTCACTAGTGCCTGAGAACAGCATGT
TCTGGAAAATGTCTCTGGAGTTAGAGATGTGTTAGCTTTTTCATTACAGATGGAGAAATA 
CAATGTTTACACAACAGTCCAGGGGTGGGGTCAAAAGTTGGAAGGTGTCATTAGACGCAG 
CCAAATAAAGTGAAGACAACCCAGGTGACTGGCAGCCCTGACTTGTGCGTGGGCGA 

BF437145 

CTGACTGTCCCGTTTTTATTTTTACCATTGAGCCTTCTACCAGTACTGAAATGGGCAAAA 
GATGGCTGATAACAAATTACACTTTACCTGTGATGGTTACTCTATGCTAGTATCCTGTTT 
TTTAAAAAATAGTTCTTATGAGGTGTTAAGAAAAGCTTTCGCTTGGATTCATACACAGTT 
GACCCT

H64601 

AGGAAGTTAAGAACAGTCCTAAAATCTCTTTGGCTTCTTTGTCCTGATATGCACCGGCAT 
TTTCACAGTAGGAACTAGGGTTTCTGTCCAGTTTTTTTGGTTCTTTAAGGAATTAATGTT
ATTCTGGGTACAACTGCTTACATACATAGCACATATAGATGACATTTTTACAGGCCGTCT
TGTTAGACTGACATACATGGAGGATAGTGCCACCCGCCTCACAAGAACATCAGGTAAGCT 
CAGGCACAGAGTCCNAGGGNATCTGTAAGGGCTTCGCCCACGCACAAGTCAGGGCTGCCA 
GTCACCNGGGTTGTCTTCACTTTATTTGGGCTGCGTCTAATGACACCTTNCCAACTTTTT 
GACCCCACCCTGGGGCTTGTTGTGTAAACCATTGTTATTTCTCCCNTCTGTAATGGAAAA
AGGTTAACACNTTTTTAACTTCCGGNGACATTTTTC

AF212365 

gcacgagcga tgtcgctcgt gctgctaagc ctggccgcgc tgtgcaggag cgccgtaccc
cgagagccga ccgttcaatg tggctctgaa actgggccat ctccagagtg gatgctacaa
catgatctaa tccccggaga cttgagggac ctccgagtag aacctgttac aactagtgtt
gcaacagggg actattcaat tttgatgaat gtaagctggg tactccgggc agatgccagc
atccgcttgt tgaaggccac caagatttgt gtgacgggca aaagcaactt ccagtcctac
agctgtgtga ggtgcaatta cacagaggcc ttccagactc agaccagacc ctctggtggt
aaatggacat tttcctacat cggcttccct gtagagctga acacagtcta tttcattggg
gcccataata ttcctaatgc aaatatgaat gaagatggcc cttccatgtc tgtgaatttc
acctcaccag gctgcctaga ccacataatg aaatataaaa aaaagtgtgt caaggccgga
agcctgtggg atccgaacat cactgcttgt aagaagaatg aggagacagt agaagtgaac
ttcacaacca ctcccctggg aaacagatac atggctctta tccaacacag cactatcatc
gggttttctc aggtgtttga gccacaccag aagaaacaaa cgcgagcttc agtggtgatt
ccagtgactg gggatagtga aggtgctacg gtgcagctga ctccatattt tcctacttgt
ggcagcgact gcatccgaca taaaggaaca gttgtgctct gcccacaaac aggcgtccct
ttccctctgg ataacaacaa aagcaagccg ggaggctggc tgcctctcct cctgctgtct
ctgctggtgg ccacatgggt gctggtggca gggatctatc taatgtggag gcacgaaagg
atcaagaaga cttccttttc taccaccaca ctactgcccc ccattaaggt tcttgtggtt
tacccatctg aaatatgttt ccatcacaca atttgttact tcactgaatt tcttcaaaac
cattgcagaa gtgaggtcat ccttgaaaag tggcagaaaa agaaaatagc agagatgggt
ccagtgcagt ggcttgccac tcaaaagaag gcagcagaca aagtcgtctt ccttctttcc
aatgacgtca acagtgtgtg cgatggtacc tgtggcaaga gcgagggcag tcccagtgag
aactctcaag actcttcccc ttgcctttaa ccttttctgc agtgatctaa gaagccagat
tcatctgcac aaatacgtgg tggtctactt tagagagatt gatacaaaag acgattacaa
tgctctcagt gtctgcccca agtaccacct catgaaggat gccactgctt tctgtgcaga
acttctccat gtcaagtagc aggtgtcagc aggaaaaaga tcacaagcct gccacgatgg
ctgctgctcc ttgtagccca cccatgagaa gcaagagacc ttaaaggctt cctatcccac
caattacagg gaaaaaacgt gtgatgatcc tgaagcttac tatgcagcct acaaacagcc
ttagtaatta aaacatttta taccaataaa attttcaaat attgctaact aatgtagcat
taactaacga ttggaaacta catttacaac ttcaaagctg ttttatacat agaaatcaat
tacagtttta attgaaaact ataaccattt tgataatgca acaataaagc atcttcagcc
aaaaaaaaaa aaaaaa

AF208110 

15 cggcgatgtc gctcgtgctg ataagcctgg ccgcgctgtg caggagcgcc gtaccccgag 
agccgaccgt tcaatgtggc tctgaaactg ggccatctcc agagtggatg ctacaacatg 
atctaatccc cggagacttg agggacctcc gagtagaacc tgttacaact agtgttgcaa 
caggggacta ttcaattttg atgaatgtaa gctgggtact ccgggcagat gccagcatcc 
gcttgttgaa ggccaccaag atttgtgtga cgggcaaaag caacttccag tcctacagct 

20 gtgtgaggtg caattacaca gaggccttcc agactcagac cagaccctct ggtggtaaat 
ggacattttc ctatatcggc ttccctgtag agctgaacac agtctatttc attggggccc 
ataatattcc taatgcaaat atgaatgaag atggcccttc catgtctgtg aatttcacct 
caccaggctg cctagaccac ataatgaaat ataaaaaaaa gtgtgtcaag gccggaagcc 
tgtgggatcc gaacatcact gcttgtaaga agaatgagga gacagtagaa gtgaacttca 

25 caaccactcc cctgggaaac agatacatgg ctcttatcca acacagcact atcatcgggt 
tttctcaggt gtttgagcca caccagaaga aacaaacgcg agcttcagtg gtgattccag 
tgactgggga tagtgaaggt gctacggtgc agctgactcc atattttcct acttgtggca 
gcgactgcat ccgacataaa ggaacagttg tgctctgccc acaaacaggc gtccctttcc 
ctctggataa caacaaaagc aagccgggag gctggctgcc tctcctcctg ctgtctctgc 

30 tggtggccac atgggtgctg gtggcaggga tctatctaat gtggaggcac gaaaggatca 
agaagacttc cttttctacc accacactac tgccccccat taaggttctt gtggtttacc 
catctgaaat atgtttccat cacacaattt gttacttcac tgaatttctt caaaaccatt 
gcagaagtga ggtcatcctt gaaaagtggc agaaaaagaa aatagcagag atgggtccag 
tgcagtggct tgccactcaa aagaaggcag cagacaaagt cgtcttcctt ctttccaatg 

35 acgtcaacag tgtgtgcgat ggtacctgtg gcaagagcga gggcagtccc agtgagaact 
ctcaagacct cttccccctt gcctttaacc ttttctgcag tgatctaaga agccagattc 
atctgcacaa atacgtggtg gtctacttta gagagattga tacaaaagac gattacaatg 
ctctcagtgt ctgccccaag taccacttca tgaaggatgc cactgctttc tgtgcagaac 
ttctccatgt caagcagcag gtgtcagcag gaaaaagatc acaagcctgc cacgatggct 

40 gctgctcctt gtagcccacc catgagaagc aagagacctt aaaggcttcc tatcccacca 
attacaggga aaaaacgtgt gatgatcctg aagcttacta tgcagcctac aaacagcctt 
agtaattaaa acattttata ccaataaaat tttcaaatat tactaactaa tgtagcatta 
actaacgatt ggaaactaca tttacaactt caaagctgtt ttatacatag aaatcaatta 
cagctttaat tgaaaactgt aaccattttg ataatgcaac aataaagcat cttccaaaaa
aaaaaaaaaa aaaaaaaaaa aaaaaaaa 



AF208111 

5 cggcgatgtc gctcgtgctg ataagcctgg ccgcgctgtg caggagcgcc gtaccccgag 

agccgaccgt tcaatgtggc tctgaaactg ggccatctcc agagtggatg ctacaacatg 

atctaatccc cggagacttg agggacctcc gagtagaacc tgttacaact agtgttgcaa 

caggggacta ttcaattttg atgaatgtaa gctgggtact ccgggcagat gccagcatcc 

gcttgttgaa ggccaccaag atttgtgtga cgggcaaaag caacttccag tcctacagct 

10 gtgtgaggtg caattacaca gaggccttcc agactcagac cagaccctct ggtggtaaat 

ggacattttc ctatatcggc ttccctgtag agctgaacac agtctatttc attggggccc 

ataatattcc taatgcaaat atgaatgaag atggcccttc catgtctgtg aatttcacct 

caccaggctg cctagaccac ataatgaaat ataaaaaaaa gtgtgtcaag gccggaagcc 

tgtgggatcc gaacatcact gcttgtaaga agaatgagga gacagtagaa gtgaacttca 

15 caaccactcc cctgggaaac agatacatgg ctcttatcca acacagcact atcatcgggt 

tttctcaggt gtttgagcca caccagaaga aacaaacgcg agcttcagtg gtgattccag 

tgactgggga tagtgaaggt gctacggtgc aggtaaagtt cagtgagctg ctctggggag 

ggaagggaca tagaagactg ttccatcatt cattgctttt aaggatgagt tctctcttgt 

caaatgcact tctgccagca gacaccagtt aagtggcgtt catgggggtt ctttcgctgc 

20 agcctccacc gtgctgaggt caggaggccg acgtggcagt tgtggtccct tttgcttgta 

ttaatggctg ctgaccttcc aaagcacttt ttattttcat tttctgtcac agacactcag 

ggatagcagt accattttac ttccgcaagc ctttaactgc aagatgaagc tgcaaagggt 

ttgaaatggg aaggtttgag ttccaggcag cgtatgaact ctggagaggg gctgccagtc 

ctctctgggc cgcagcggac ccagctggaa cacaggaagt tggagcagta ggtgctcctt 

25 cacctctcag tatgtctctt tcaactctag tttttgaagt ggggacacag gaagtccagt 

ggggacacag ccactcccca aagaataagg aacttccatg cttcattccc tggcataaaa 

agtgntcaaa cacaccagag ggggcaggca ccagccaggg tatgatgggt actacccttt 

tctggagaac catagacttc ccttactaca gggacttgca tgtcctaaag cactggctga 

aggaagccaa gaggatcact gctgctcctt ttttgtagag gaaatgtttg tgtacgtggt 

30 aagatatgac ctagcccttt taggtaagcg aactggtatg ttagtaacgt gtacaaagtt 

taggttcaga ccccgggagt cttgggcatg tgggtctcgg gtcactggtt ttgactttag 

ggctttgtta cagatgtgtg accaagggga aaatgtgcat gacaacacta gaggtagggg 

cgaagccaga aagaagggaa gttttggctg aagtaggagt cttggtgaga ttttgctgtg 

atgcatggtg tgaactttct gagcctcttg tttttcctca gctgactcca tattttccta 

35 cttgtggcag cgactgcatc cgacataaag gaacagttgt gctctgccca caaacaggcg 

tccctttccc tctggataac aacaaaagca agccgggagg ctggctgcct ctcctcctgc 

tgtctctgct ggtggccaca tgggtgctgg tggcagggat ctatctaatg tggaggcacg 

aaaggatcaa gaagacttcc ttttctacca ccacactact gccccccatt aaggttcttg 

tggtttaccc atctgaaata tgtttccatc acacaatttg ttacttcact gaatttcttc 

40 aaaaccattg cagaagtgag gtcatccttg aaaagtggca gaaaaagaaa atagcagaga 

tgggtccagt gcagtggctt gccactcaaa agaaggcagc agacaaagtc gtcttccttc 

tttccaatga cgtcaacagt gtgtgcgatg gtacctgtgg caagagcgag ggcagtccca 

gtgagaactc tcaagacctc ttcccccttg cctttaacct tttctgcagt gatctaagaa 
gccagattca tctgcacaaa tacgtggtgg tctactttag agagattgat acaaaagacg
attacaatgc tctcagtgtc tgccccaagt accacttcat gaaggatgcc actgctttct
gtgcagaact tctccatgtc aagcagcagg tgtcagcagg aaaaagatca caagcctgcc
acgatggctg ctgctccttg tagcccaccc atgagaagca agagacctta aaggcttcct
atcccaccaa ttacagggaa aaaacgtgtg atgatcctga agcttactat gcagcctaca
aacagcctta gtaattaaaa cattttatac caataaaatt ttcaaatatt actaactaat
gtagcattaa ctaacgattg gaaactacat ttacaacttc aaagctgttt tatacataga
aatcaattac agctttaatt gaaaactgta accattttga taatgcaaca ataaagcatc
ttccaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaa

AF250309

atgtcgctcg tgctgctaag cctggccgcg ctgtgcagga gcgccgtacc ccgagagccg
accgttcaat gtggctctga aactgggcca tctccagagt ggatgctaca acatgatcta
atcccgggag acttgaggga cctccgagta gaacctgtta caactagtgt tgcaacaggg
gactattcaa ttttgatgaa tgtaagctgg gtactccggg cagatgccag catccgcttg
ttgaaggcca ccaagatttg tgtgacgggc aaaagcaact tccagtccta cagctgtgtg
aggtgcaatt acacagaggc cttccagact cagaccagac cctctggtgg taaatggaca
ttttcctata tcggcttccc tgtagagctg aacacagtct atttcattgg ggcccataat
attcctaatg caaatatgaa tgaagatggc ccttccatgt ctgtgaattt cacctcacca
ggctgcctag accacataat gaaatataaa aaaaagtgtg tcaaggccgg aagcctgtgg
gatccgaaca tcactgcttg taagaagaat gaggagacag tagaagtgaa cttcacaacc
actcccctgg gaaacagata catggctctt atccaacaca gcactatcat cgggttttct
caggtgtttg agccacacca gaagaaacaa acgcgagctt cagtggtgat tccagtgact
ggggatagtg aaggtgctac ggtgcagctg actccatatt ttcctacttg tggcagcgac
tgcatccgac ataaaggaac agttgtgctc tgcccacaaa caggcgtccc tttccctctg
gataacaaca aaagcaagcc gggaggctgg ctgcctctcc tcctgctgtc tctgctggtg
gccacatggg tgctggtggc agggatctat ctaatgtgga ggcacgaaag gatcaagaag
acttcctttt ctaccaccac actactgccc cccattaagg ttcttgtggt ttacccatct
gaaatatgtt tccatcacac aatttgttac ttcactgaat ttcttcaaaa ccattgcaga
agtgaggtca tccttgaaaa gtggcagaaa aagaaaatag cagagatggg tccagtgcag
tggcttgcca ctcaaaagaa ggcagcagac aaagtcgtct tccttctttc caatgacgtc
aacagtgtgt gcgatggtac ctgtggcaag agcgagggca gtcccagtga gaactctcaa
gacctcttcc cccttgcctt taaccttttc tgcagtgatc taagaagcca gattcatctg
cacaaatacg tggtggtcta ctttagagag attgatacaa aagacgatta caatgctctc
agtgtctgcc ccaagtacca cctcatgaag gatgccactg ctttctgtgc agaacttctc
catgtcaagc agcaggtgtc agcaggaaaa agatcacaag cctgccacga tggctgctgc
tccttgtagc ccacccatga gaagcaagag accttaaagg gttccttttc ccatcattta
caggggaaaa acgtgtgatg atc



AK095091

catattagag tctacagata tgcctttctt acagcaatcc tgcacccaca taaaagctac
attttcaata caagattaaa aggtattctg caaaatgtgc aaggttttca tgtctgctgg 
tgtagctgta gtgatggctt catgaatttt tttctttttt gactatggtc cttacgctgg
attcatttat cttgaaatgg tgaacaatca cagctgcaga ccctcaattt atggtacata 
tcaagcaatt tggctttttt tcttgtaatg aaaaaaaaaa gttttttttg ctttttttca 
tgacactgct tcttgggagc actgccagca ttactagtgg cacttcgtat gggtcctaag 
5 gtgttattga aggtttacga tattgcacta aacacgaaaa ataccagaga accactggag 
atacttttta ctgtgatatg taatttactg gagacaggaa ctgctcgttt ggagatggtt 
agcatcacag ggtgttttaa gtcgatactt gcaacccttg agctcaccac agtagcaaca 
ggaggtggct aggaaattat tcacagcagg acagtacgca ctgcaattaa ttgtatgcag 
ttatgattta ataccacatc tttatgctca cgtttctctc aactgtgaat ggtgccatgt 

10 acagttggta tgtgtgtgtt taagttttga taaattttta acttttaata gttaaaatag 
ttaactattg gtatggtagg aaatgataaa gtagactagt atctgtatac attttctgca 
tttatgacat acctttttct tcattttttt caatatttta attgaaaagt tcatccgagt 
ttcatctaag ttttttcaaa gtgatacaaa tctccaaaaa attttccaat atatgtattg 
aaaaaatcca ggtgtaagtg gctctgcgca gtccaaacct gtgttgttca agggtcaact 

15 gtgtatgaat ccaagcgaaa gcttttctta acacctcata agaactattt tttaaaaaac 
aggaactagc atagagtaac catcacaggt aaagtgtaat ttgttatcag ccatcttttg 
cccatttcag tactggtaga aggctcaatg gtaaaaataa aaacgggaca gtcagaagat 
ctggaagtcc tgaccctgct ttcacctggc atgtgtaatc cagtcatgct cgtatcagtc 
tctgtaggag cacttgaagg tattacataa atgctatcta actctgggaa acgccaacat 

20 gtgattgcct ccagaggaat cttctttaaa aaaaaattca aaatgttatt tccttactag 
gatgtcttta aagaattata acccttaccg tgcctccaca ttagatagat ccctgccacc 
agcacccatg tggccaccag cagagacagc aggaggagag gcagccagcc tcccggcttg 
cttttgtctg gaaaaaacaa agcttattca cctttggaaa acaaatccac acttatctct 
taatttaaaa actaagactt ggtatacttt atagaggttt atttattttt tattattttt 

25 tagttttgag acagagtctc gctttgttgc ctaggctgga gtgcagtggc gcaatctcgg 
ttcactgcag cctccgtctc ccgggttcaa gcaatgctgc ctcagcctcc tgagtagctg 
ggattacagg catgtgtcac cgcgcccagc cactttgtag agatttagat ccctttaaaa 
ccatcagtca gaagctcttt agatagtctg ccaatcatat ctttttccct agagtgtgca 
ggtcttgcat tagattctca aaagggatat gggacccagg aagttaagaa cagtcctaaa 

30 atctctttgg cttctttgtc ctgatatgca ccggcatttt cacagtagga actagggttt 
ctgtccagtt tttttggttc tttaaggaat taatgttatt ctgggtacaa ctgcttacat 
acatagcaca tatagatgac atttttacag gccgtcttgt tagactgaca tacatggagg 
atagtgccac ccgcctcaca agaacatcag gtaagctcag gcacagagtg cccaggaatc 
tgtaaggctt cgcccacgca caagtcaggg ctgccagtca cctgggttgt cttcacttta 

35 tttggctgcg tctaatgaca ccttccaact tttgacccca cccctggact gttgtgtaaa 
cattgtattt ctccatctgt aatgaaaaag ctaacacatc tctaactcca gagacatttt 
ccagaacatg ctgttctcag gcactagtga ggcggtacca ttattcctca tttgttatcc 
aaatgttggc catgtgacca caccaaaagc tcatcctggg ccactgagac tagtaattga 
atcagaatat agtgaaatat tcattctcat atatacccag ccatcttaca tctttggctt 

40 ttttcagcag atccttgtgg cactcagaac atccattttg cactgtgtat ttttttccct 
tctgtgtatc ctgctttgta aagagtcacg agtggtttta caaataaagc ctgttcttac 
tcag 

BM983744 
TTTTTTTTTTTTTTTTCTGAGTAAGAACAGGCTTTATTTGTAAAACCACTCGTGACTCTT
TACAAAGCAGGATACACAGAAGGGAAAAAAATACACAGTGCAAAATGGATGTTCTGAGTG 
CCACAAGGATCTGCTGAAAAAAGCCAAAGATGTAAGATGGCTGGGTATATATGAGAATGA 
ATATTTCACTATATTCTGATTCAATTACCAGTCTCAGTGGCCCAGGATGAGCTTTTGGTG 
5 TGGTCACATGGCCAACATTTGGATAACAAATGAGGAATAATGGTACCGCCTCACTAGTGC 
CTGAGAACAGCATGTTCTGGAAAATGTCTCTGGAGTTAGAGATGTGTTAGCTTTTTCATT 
ACAGATGGAGAAATACAATGTTTACACAACAGTCCAGGGGTGGGGTCAAAAGTTGGAAGG 
TGTCATTAGACGCAGCCAAATAAAGTGAAGACAACCCAGGTGACTGGCAGCCCTGACTTG 
TGCGTGGGCGAAGCCTTACAGATTCCTGGGCACTCTGTGCCTGAGCTTACCTGATGTTCT 
TGTGAGGCGGGTGGCACTATCCTCCATGTATGTCAGTCTAACAAGACGGCCTGTAAAAAT
GTCATCTATATGTGCTATGTATGTAAGCAGTTGTACCCAGAATAACATTAATCCTCGTGC 
CGAAT 

CB305764 

TTTTTTTTTTTTTTTGTTGGGCTGAAGATGCTTTATTATTGCATTATCAAAATGGTTATA
GTTTTCAATTAAAACTGTAATTGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGT 
AGTTTCCAATCGTTAGTTAATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTAT 
AAAATGTTTTAATTACTAAGGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACA 
CGTTTTTTCCCTGTAATTGGTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGT 

20 GGGCTACAAGGAGCAGCAGCCATCGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTG 
CTGCTTGACATGGAGAAGTTCTGCACAGAAAGCAGTGGCATCCTTCATGAGGTGGTACTT 
GGGGCAGACACTGAGAGCATTGTAATCGTCTTTTGTATCAATCTCTCTAAAGTAGACCAC 
CACGTATTTGTGCAGATGAATCTGGCTTCTTAGATCACTGCAGAAAAGGTTAAAGGCAAG 
GGGGAAGAGGTCTTGAGAGTTCTCACTGGGACTGCCCTCGCTCTTGCCACAGGTACCATC 

25 GCACACACTGTTNACGTCATTGGAAAGAAGGAAGACGACTTTGTCTGCTGCCTTCTTTTG 
AGTG 



BM715988 

TGGTTTTTGTTTTTTTTTCATTTTCTGTTGGATTACAGAAAAAGAATGGGACCCATTCAG 
30 GTCTCGATTTCCAAAGGTAAAGATGGAAGGCTGGGCAGACTGGCTTTTGTTACCTGACAT 
GCCGTAGGGTGAGCTTAGAGGAAGAAAGAAAACAATTTTTATTTGGCCAAAACAGAACAA 
ATGCTGAAAAGGAAATCTTGTTTTTTTCCTAAAGCCAAATAGAAATGATTTGGGTATAAT 
TTAAGAGTCCTTGTGTTGTACAGATATGGTGACTGATGTAGTTATTAATACTACCAACTT 
AGTCATCAAGCCTCAATTTTCCTTTACCTGAAGGATTAAGTGAAAGCTTTTGGAGTTCAT 
35 GATGTTCAGTATGATCAGTTAACCTTAACCTCTGAGCATCCTGAAGCAAAATCTAAATAA 
TGCAGCTATTACCACTGGTGGTCCAGGCTCTGGTGAAGCCCTCTGAGCCCAGGAGGAAGA 
GAAAGCATTGTCCAGAGGTAGGAACACAGTCTGGGAGCCCAGAGCTCTGGGAGGAGTGGG 
AAAATGCTGCTTCCTGCTGCTTGCTTCTAGGCACCTGCTTCCGCCATCTCACTTACCATG 
GCTAGAGATGGGGGTGAGACTGGGGAAGGACACAAGCAGGGAACAGATAAGGGATGGAAA 
TCAGAAGGGAATATAGAAAGAACTCTGGATGTGGAGACATGCCGGTACCTGAGCATTTTG
TATCAATGGGAGTACCTCT 

BM670929

TTTTTTTTTTTTTTTTTTGGCTGAAGATGCTTTATTGTTGCATTATCAAAATGGTTACAG 
TTTTCAATTAAAGCTGTAATTGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGTA 
GTTTCCAATCGTTAGTTAATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATA 
5 AAATGTTTTAATTACTAAGGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACAC 
GTTTTTTTCCCTGTAATTGGTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGT 
GGGCTACAAGGAGCAGCAGCCATCGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTG 
CTGCTTGACATGGAGAAGTTCTGCACAGAAAGCAGTGGCATCCTTCATGAGGTGGTACTT 
GGGGCAGACACTGAGAGCATTGTAATCGTCTTTTGTATCAATCTCTCTAAAGTAGACCAC 
CACGTATTTGTGCAGATGAATCTGGCTTCTTAGATCACTGCAGAAAAGGTTAAAGGCAAG
GGGGAAGAGGTCTTGAGAGTTCTCACTGGGACTTGCCTCGCTCTTGCCACAGGTACCATC 
GCACACACTGTTGACGTCATTGGAAAGAAAGAAGACGACTTTGTCTGCTGCCTTCTT 

BI792416 

GCTGAAGATGCTTTATTGTTGCATTATCAAAATGGTTACAGTTTTCAATTAAAGCTGTAA
TTGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGTAGT 

BI715216 

CACGCGTCCGATTTTATACCAATAAAATTTTCAAATATTGCTAACTAATGTAGCATTAAC 
20 TAACGATTGGAAACTACATTTACAACTTCAAAGCTGTTTTATACATAGAAATCAATTACA 
GCTTTAATTGAAAACTGTAACCATTTTGATAATGCAACAATAAAGCATCTTCAGCCAAAA 
AAAAAAA 

N56060 

25 AGAAAAAGAAAATAGCAGAGATGGGTCCAGTGCAGTGGCTTGCATAAAAAAGAAGGCAGC 
AGACAAAGTCGTCTTCCTTCTTTCCAATGACGTCAACAGTGTGTGCGATGGTACCTGTGG 
CAAGAGCGAGGGCAGTCCCAGTGAGAACTCTCAAGACCTCTTCCCCCCTTGCCTTTAACC 
TTTTCTGCAGTGATCTAAGAAGCCAGATTCATCTGCACAAATACGTGGTGGTCTACTTTA 
GAGAGATTGATACAAAAGACGATTACAATGCTCTCAGTGTCTGCCCCAAGTACCACCTCA 

30 TGAAGGATGCCACTGCTTTCTGTGCAGAACTTCTCCATGTCAAGCAGCAGGTTTCAGCAG 
G 

CB241389 

TTTTTTTTTTTTTTGTTTGGCTGAAGATGCTTTATTGTTGCATTATCAAAATGGTTACAG 
35 TTTTCAATTAAAGCTGTAATTGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGTA 
GTTTCCAATCGTTAGTTAATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATA 
AAATGTTTTAATTACTAAGGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACAC 
GTTTTTTCCCTGTAATTGGTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTG 
GGCTACAAGGAGCAGCAGCCATCGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTGC 
40 TGCTTGACATGGAGAAGTTCTGCACAGAAAGCAGTGGCATCCTTCATGAGGTGGTACTTG 
GGGCAGACACTGAGAGCATTGTAATCGTCTTTTGTATCAATCTCTCTAAAGTAGACCACC
ACGTATTTGTGCAGATGAATCTGGCTTCTTAGATCACTGCAGAAAAGGTTAAAGGCAAGG 
GGGAAGAGGTCTTGAGAGTTCTCACTGGGACTGCCCTCGCTCTTGCCACAGGTACCATCG 
CACACACTGTTGACGTCATTGGAAAGAAGGAAGACGACTTTGTCTGCTGCCTTCTTTTGA 
5 GTGGCAAGCCACTGCACTGGACCCATCTCTGCTATTTTCTTTTTCTNGCACTTTTCAAGG 
ATGACTCACTTCTGCAATGGTTTTTGAGAATTCAGTGAAGTACAAATGTGTGATGGAACA 
TAT 

AV660618 

CGCTCGTGCTGCTAAGCCTGGCCGCGCTGTGCAGGAGCGCCGTACCCCGAGAGCCGACCG 
TTCAATGTGGCTCTGAAACTGGGCCATCTCCAGAGTGGATGCTACAACATGATCTAATCC 
CCGGAGACTTGAGGGACCTCCGAGTAGAACCTGTTACAACTAGTGTTGCAACAGGGGACT 
ATTCAATTTTGATGAATGTAAGCTGGGTACTCCGGGCAGATGCCACACCAGAAGAAACAA 
ACGCGAGCTTCAGTGGTGATTCCAGTGACTGGGGATAGTGAAGGTGCTACGGTGCAGCTG 
ACTCCATATTTTCCTACTTGTGGCAGCGACTGCATCCGACATAAAGGAACAGTTGTGCTC 
TGCCCACAAACAGGCGTCCCTTTCCCTCTGGATAACAAC 

BX088671 

GCTGAGTGTGATGGTGTAAGCCTGTGGTCCCAGCTACTAGGGAGGCTGAGATGGGATTAC 
20 AGGTGTGAGCCACGGCGCCTGGCCTAAAAGCATCTTTTTCTTTAACGCAGAGGTTATGTT 
GTATTATTAGCATAAATGTTTTTTTCTGGGAATGCTTATTTCACACAGCACAATACTGAA 
TCTTCTCTGGAATGTGGATCGATTTCAGATGGATGACTATTAAAATGTGTATATTTGCAG 
ATTATCCTTAAAGGGCCACCTCATGCCTTCTAATTTATGTCTTACGGATAAAAAATCAAA 
ATGAAGCATAAAGTAAAAACTGTGTCCAGCTTTACAAGTGGACGCTTAGTAATGGCTGAG 
25 GCAATATGTTTAATGTAGCAAATTTTACTTATTTGTCATGATCAGTTTTCACAGTGCTTG 
TAAGTGCTGGTAATAGAAGATGGACATGGTTTAGGTCAAAACTTGGACCAGAAACCAACT 
TCCTTTGAAACAGCTCTACCAGNTATAAGAGCAATATG 

CB154426

CTGTTGACGTCATTGGAAAGAAGGAAGACGACTTTGTCTGCTGCCTTCTTTTGAGTGGCA 
AGCCACTGCACTGGACCCATCTCTGCTATTTTCTTTTTCTGCCACTTTTCAAGGATGACC 
TCACTTCTGCAATGGTTTTGAAGAAATTCAGTGAAGTAACAAATTGTGTGATGGAAACAT 
ATTTCAGATGGGTAAACCACAAGAACCTTAATGGGGGGCAGTAGTGTGGTGGTAGAAAAG 
GAAGTCTTCTTGATCCTTTCTGTGAGAGGAGAAAAGCATTTGTTATCTGTGAACAGCAAA 
CAGCAGGCTTTCACTCTGTAAACCATCCCTGACAAATGATCCCTTGCTAGAGAATGTCAG 
CTGAGCACCAAGGGCCTTGTTAGTGACAGCAAGGAAAAACATCCTGATGTTCCTTTTGAA 
CACATCACCTGAAACACACTGATGCTTAAACCTTAACTTTTTTTTTTTTGGAGACACAGT 
CTCACTCTGT 

CA434589

TTTTTTTTTTTTTTTTTTCTGAGTAAGAACAGGCTTTATTTGTAAAACCACTCGTGACTC
TTTACAAAGCAGGATACACAGAAGGGAAAAAAATACACAGTGCAAAATGGATGTTCTGAG 
TGCCACAAGGATCTGCTGAAAAAAGCCAAAGATGTAAGATGGCTGGGTATATATGAGAAT 
GAATATTTCACTATATTCTGATTCAATTACCAGTCTCAGTGGCCCAGGATGAGCTTTTGG 
5 TGTGGTCACATGGCCAACATTTGGATAACAAATGAGGAATAATGGTACCGCCTCACTAGT 
GCCTGAGAACAGCATGTTCTGGAAAATGTCTCTGGAGTTAGAGATGTGTTAGCTTTTTCA 
TTACAGATGGAGAAATACAATGTTTACACAACAGTCCAGGGGTGGGGTCAAAAGTTGGAA 
G 

CA412162

TTTTTTTTTTTTTTTTTTGGCTGAAGATGCTTTATTGTTGCATTATCAAAATGGTTATAG 
TTTTCAATTAAAACTGTAATTGATTTCTATGTATAAAACAGCTTTGAAGTTGTAAATGTA 
GTTTCCAATCGTTAGTTAATGCTACATTAGTTAGCAATATTTGAAAATTTTATTGGTATA 
AAATGTTTTAATTACTAAGGCTGTTTGTAGGCTGCATAGTAAGCTTCAGGATCATCACAC 

15 GTTTTTTCCCTGTAATTGGTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTG 
GGCTACAAGGAGCAGCAGCCATCGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTGC 
TGCTTGACATGGAGAAGTTCTGCACAGAAAGCAGTGGCATCCTTCATGAGGTGGTACGTG 
GGGCAGACACTGAGAGCATTGTAATCGTCTTTTGTATCAATCTCTCTAAAGTAGACCACC 
ACGTATTTGTGCAGATGAATCTGGCTTCTTAGATCACTGCAGAAAAGGTTAAAGGCAAGG 

20 GGGAAGA 

CA314073

TTTTTTTTTTTTTTTTTTGAAAGGGTCAGGACTTCCAGATCTTCTGACTGTCCCGTTTTT 
ATTTTTACCATTGAGCCTTCTACCAGTACTGAAATGGGCAAAAGATGGCTGATAACAAAT 

25 TACACTTTACCTGTGATGGTTACTCTATGCTAGTTCCTGTTTTTTAAAAAATAGTTCTTA 
TGAGGTGTTAAGAAAAGCTTTCGCTTGGATTCATACACAGTTGACCCTTGAACAACACAG 
GTTTGGACTGCGCAGAGCCACTTACACCTGGATTTTTTCAATACATATATTGGAAAATTT 
TTTGGAGATTTGTATCACTTTGAAAAAACTTAGATGAAACTCGGATGAACTTTTCAATTA 
AAATATTGAAAAAAATGAAGAAAAAGGTATGTCATAAATGCAGAAAATGTATACAGATAC 

30 TAGTCTACTTTATCATTTCCTACCATACCAATAGTTAACTATTTTAACTATTAAAAGTTA 
AAAATTTATCAAAACTTAAACACACACATACCAACTGTACATGGCACCATTCACAGTTGA 
GAGAAACGTGAGCATAAAGATGTGGTATTAAATCATAACTGCATACAATTAATTGCAGTG 
CGTACTGTCCTGCTGTGAATATTTCCTAGCCCTCGTGCCGAATC 

BF921554

GTGGGTGACCGTGGCTTGCCACTCAAAAGAAGGCAGCAGACAAAGTCGTCTTCCTTCTTT 
CCAATGACGTCAACAGTGTGTGCGATGGTACCTGTGGCAAGAGCGAGGGCAGTCCCAGTG 
AGAACTCTCAAGACCTCTTCCCCCTTGCCTTTAACCTTTTCTGCAGTGATCTAAGAAGCC 
AGATTCATCTGCACAAATACGTGGTGGTCTACTTTAGAGAGATTGATACAAAAGACGATT 
40 ACAATGCTCTCAGTGTCTGCCCCAAGTACCACCTCATGAAGGATGCCACTGCTTTCTGTG 
CATAACTTCTCCATGTCAAGCAGCAGGTGTCAGCAGGAAAAAGATCACAAGCCTGCCACG 
ATGGCTGCTGCTCCTTGTAGCCCACCCATGAGAAGCAAGAGACCTTAAAGGCTTCCTATC
CCACCAATTACAGGGAAAAAAACGTGTGATGATCCTGAAGCCACGGTCAA 

BF920093 

5 TAGAGGATCCCGGTCGACGGTGGTTCAGTGATCATCACACTTTTTCCCTGTAATAGGTGG 
GATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTACAAGGAGCAGCAGCCAT 
CGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTGCTGCTTGACATGGAGAAGTTATG 
CACAGAAAGCAGTGGCATCCTTCATGAGGTGGTACTTGGGGCAGACACTGAGAGCATTGT 
AATCGTCTTTTGTATCAATCTCTCTAAAGTAGACCACCACGTATTTGTGCAGATGAATCT 
GGCTTCTTAGATCACTGCAGAAAAGGTTAAAGGCAAGGGGGAAGAGGTCTTGAGAGTTCT
CACTGGGACTGCCCTCGCTCTTGCCACAGGTACCATCGCACACACTGTTGACGTCATTGG 
AAAGAAGGAAGACGACTTTGTCTGCTGCCTTCTTTTGAGTGGCAAGCCACGGTCAACCCA 
CAAGCCACGGTCAACCCAC 

AV685699

TCTACGTGGTAAGATATGACCTAGCCCTTTTAGGTAAGCGAACTGGTATGTTAGTAACGT 
GTACAAAGTTTAGGTTCAGACCCCGGGAGTCTTGGGCATGTGGGTCTCGGGTCACTGGTT 
TTGACTTTAGGGCTTTGTTACAGATGTGTGACCAAGGGGAAAATGTGCATGACAACACTA 
GAGGTAGGGGCGAAGCCAGAAAGAAGGGAAGTTTTGGCTGAAGTAGGAGTCTTGCGACTG 

20 CATCCGACATAAAGGAACAGTTGTGCTCTGCCCACAAACAGGCGTCCCTTTCCCTCTGGA 
TAACAACAAAAGCAAGCCGGGAGGCTGGCTGCCTCTCCTCCTGCTGTCTCTGCTGGTGGC 
CACATGGGTGCTGGTGGCAGGGATCTATCTAATGTGGAGGCACGAAAGGATCAAGAAGAC 
TTCCTTTTCTACCACCACACTACTGCCCCCCATTAAGGTTCTTGTGGTTTACCCATCTGA 
AATATGTTTCCATCACACAATTTGTTACTTCACTGAATTTCTTCAAAACCATTGCAGAAG 

25 TGAGGTCATCCTTGAAAGTGGCAGAGTAGCAGAGATGGGTCCAGTGCAGTGGCTTGCCAC 
TCGTGCGATGGTCTT 

AV650175 

GGCACGAGCACTGGCTGAAGGAAGCCAAGAGGATCACTGCTGCTCCTTTNTTCTAGAGGA 
30 AATGTTTGTCTACGTGGTAAGATATGACCTAGCCCTTTTAGGTAAGCGAACTGGTATGTT 
AGTAACGTGTACAAAGTTTAGGTTCAGACCCCGGGAGTCTTGGGCATGTGGGTCTCGGGT 
CACTGGTTTTGACTTTAGGGCTNTGTTACAGATGTGTGACCAAGGGGAAAATGTGCATGA 
CAACACTAGAGCTGACTCCATATTTTCCTACTTGTGGCAGCGACTGCATCCGACATAAAG 
GAACAGTTGTGCTCTGCCCACANACAGGCGTCCCTTTCCCTCTGGATAACAACATAAGCA 
35 AGCCGGGAGGCTGGCTGCCTCTCCTCCTGCTGTCTCTGCTGGTGGCACATGGGTGCTGGT 
GGAGGGATCTATCTAATGTGGAGGCACGGATCAAGAAGACTTNCTTNTCTACCACCACAC 
TACTGGCCCCAATAAGGGTCTNGTGGNTACCCCATCTGAATATGTTCATACACAATTTGT 
ACTCACTGAATTCTCAAAACATTGAGAGTGAGGCATCCTGAAAGTGCGAAAAGANATGCN 
AATGGTCAGTGCATGCTGCACTAGCAGCATGGACTT 


BX483104 

GATCCCGCGCAGTGGCCCGGCGATGTCGCTCGTGCTGCTAAGCCTGGCCGCGCTGTGCAG
GAGCGCCGTACCCCGAGAGCCGACCGTTCAATGTGGCTCTGAAACTGGGCCATCTCCAGA 
GTGGATGCTACAACATGATCTAATCCCCGGAGACTTGAGGGACCTCCGAGTAGAACCTGT 
TACAACTAGTGTTGCAACAGGGGACTATTCAATTTTGATGAATGTAAGCTGGGTACTCCG 
5 GGCAGATGCCAGCATCCGCTTGTTGAAGGCCACCAAGATTTGTGTGACGGGCAAAAGCAA 
CTTCCAGTCCTACAGCTGTGTGAGGTGCAATTACACAGAGGCCTTCCAGACTCAGACCAG 
ACCCTCTGGTGGTAAATGGACATTTTCCTACATCGGCTTCCCTGTAGAGCTGAACACAGT 
CTATTTCATTGGGGCCCATAATATTCCTAATGCAAATATGAATGAAGATGGCCCTTCCAT 
GTCTGTGAATTTCACCTCACCAGGCTGCCTAGACCACATAATGAAATATAAAAAAAAGTG 
TGTCAAGGCCGGAAGCCTGTGGGATCCGAACATCACTGCTTGTAAGAAGAATGAGGAGAC
AGTAGAAGTGAACTTCACAACCACTCCCCTGGGAAACAGATACATGGCTCTTATCCAACA 
CAGCACTATCATTCGG 

CD675121 

GTCTTGCATTAGATTCTCAAAAGGGATATGGGACCCAGGAAGTTAAGAACAGTCCTAAAA 
TCTCTTTGGCTTCTTTGTCCTGATATGCACCGGCATTTTCACAGTAGGAACTAGGGTTTC 
TGTCCAGTTTTTTTGGTTCTTTAAGGAATTAATGTTATTCTGGGTACAACTGCTTACATA 
CATAGCACATATAGATGACATTTTTACAGGCCGTCTTGTTAGACTGACATACATGGAGGA 
TAGTGCCACCCGCCTCACAAGAACATCAGGTAAGCTCAGGCACAGAGTGCCCAGGAATCT 
GTAAGGCTTCGCCCACGCACAAGTCAGGGCTGCCAGTCACCTGGGTTGTCTTCACTTTAT 
TTGGCTGCGTCTAATGACACCTTCCAACTTTTGACCCCACCCCTGGACTGTTGTGTAAAC 
ATTGTATTTCTCCATCTGTAATGAAAAAGCTAACACATCTCTAACTCCAGAGACATTTTC 
CAGAACATGCTGTTCTCAGGCACTAGTGAGGCGGTACCATTATTCCTCATTTGTTATCCA 
AATGTTGGCCATGTGACCACACCAAAAGCTCATCCTGGGCCACTGAGACTGGTAATTGAA 
TCAGAATATAGTGAAATATTCATTCTCATATATACCCAGCCATCTTACATCTTTGGCTTT 
TTTCAGCAGATCCTTGTGGCACTCAGAACATCCATTTTGCACTGTGTATTTTTTTCCCTT 
CT 

BE081436 

TGTGTAACTCTCAAGACCTCTTCCCCCTTGCCTTTAACCTTTTCTGCAGTGATCTAAGAA 
GCCAGATTCATCTGCACAAATACGTGGTGGTCTACTTTAGAGAGATTGATACAAAAGACG 
ATTACAATGCTCTCAGTGTCTGCCCCAAGTACCACCTCATGGAGGATGCCACTGCTTTCT 
GTGCAGAACTTCTCCATGTCAAGTAGCAGGTGTCAGCAGGAAAAAGATCACAAGCCTGCC 
ACGATGGCTGCTGCTCCTTGTAGCCCACCCATGAGAAGCAAGAGACCTTAAAGGCTTCCT 
ATCCCACCAATTACAGGGAAAAAACGTGTGATGAT 

AW970151 

CTGAAATATGTTTCCATCACACAATTTGTTACTTCACTGAATTTCTTCAAAACCATTGCA 
GAAGTGAGGTCATCCTTGAAAAGTGGCAGAAAAAGAAAATAGCAGAGATGGGTCCAGTGC 
40 AGTGGCTTGCCACTCAAAAGAAGGCAGCAGACAAAGTCGTCTTCCTTCTTTCCAATGACG 
TCAACAGTGTGTGCGATGGTACCTGTGGCAAGAGCGAGGGCAGTCCCAGTGAGAACTCTC 
AAGACCTCTTCCCCCTTGCCTTTAACCTTTTCTGCAGTGATCTAAGAAGCCAGATTCATC
TGCACAAATACGTGGTGGTCTACTTTAGAGAGATTGATACAAAAGACGATTACAATGCTC 
TCAGTGTCTGCCCCAAGTACCACCTCATGAAGGATGCCACTGCTTTCTGTGCAGAACTTC 
TCCATGTCAAGTAGCAGGTGTCAGCAGGAAAAAGATCACAAGCCTGCCACGATGGCTGCT 
5 GCTCCTTGTAGCCCACCCATGAGAAGCAAGAGACCTTAAAGGCTTCCTATCCCACCAATT 
ACAGGGAAAAAAACGTGTGATGATCCCTGAAGCTTACTATGCAGCCTACANACAGCCTTA 
GTAATAAAACATTTTATCCAATAAAATTTCAAATTTTGCTTAACTATGTGCATAAACTAC 
GATTGAAAACTCTTTACACT 

AW837146

CATTGTGGTTGCAGCTGCATAGTAAGCTTCAGGATCATCACACGTTTTTTCCCTGTAATT 
GGTGGGATAGGAAGCCTTTAAGGTCTCTTGCTTCTCATGGGTGGGCTACAAGGAGCAGCA 
GCCATCGTGGCAGGCTTGTGATCTTTTTCCTGCTGACACCTGCTGCTTGACATGGAGAAG 
TTCTGCACAGAAAGCAGTGGCATCCTTCATGAGGTGGTACTTGGGGCAGACACTGAGAGC 
ATTGTAATCGTCTTTTGTATCAATCTCCCTAAAGTAGACCACCACGTATTTGTGCAGATG
AATCTGGCTTCTTAGATCACTGCAGAAAAGGTTAAAGGCAAGGGGGAAGAGGTCTTGAGA 
GTTCTCACTGGGACTGCCCTCGCTCTTGCCACAGGTACCATCGCACACACTGTTGACGTC 
ATTGGAAAGAAGGAAGACGACTTTGTCTGCTGCCTTCTTTTGAGTGGCAAGCCACTGCAC 
TGGACCCATCT 


AW368264 

GTGAATAAGCTTTGTTTTTTCCAGACAAAAGCAAGCCAGGAGGCTGGCTGCCTCTCCTCC 
TGCTGTCTCTGCTGGTGGCCACATGGTTGCTGGTGGCAGGGATCTATCTAATGTGGAGGC 
ACGGTAAGGGTTATAATTCTTTAAAGTCATCCTAGTAAGGAAATAACATTTGGAATTTTT 

25 TTTTAAAGAAGATTCCTCTGGAGGCAATCACCTGTTGGCGTTTCCCAGAGTTAGATAGCA 
TTTATGTAATACCTTCAAGTGCTCCTACAGAGACTGATACGAGCATGACTGGATTACACA 
TGCCAGGTGAAAGCAGGGCCAGGACTTCCAGATCTTCTGACTGTCCCGTTTTTATTTTTA 
CCATTGAGCCTTCTACCAGAACTGAAATGGGCAAAAGATGGCTGATAACAAATTACACTT 
TACCTGTGATGGTTACTCTATGCTAGTTCCTGTTTTTAAAAAAATAGTTCTTATGAGGTG 

30 TCAAGAAAAGCTTTCGCTTGGATTCATACACAGTTGACCCTTGAACAACACAG 

D25960 

GATCCTGAAGCTTACTATGCAGCCTACAAACAGCCTTAGTAATTAAAACATTTTATACCA 
ATAAAATTTTCAAATATTGCTAACTAATGTAGCATTAACTAACGATTGGAAACTACATNN 
ACAACTTCAAAGCTGTTTTATACATAGAAATCAATTACAGCTTTAATTGAAAACTATAAC
CATTTTGATAATGCAACANTAAAGCATCTTCAGCCAAA 

AV709899 

GCAACTTCCAGTCCTACAGCTGTGTGAGGTGCAATTACACAGAGGCCTTCCAGACTCAGA 
40 CCAGACCCTCTGGTGGTAAATGGACATTTTCCTATATCGGCTTCCCTGTAGAGCTGAACA 
CAGTCTATTTCATTGGGGCCCATAATATTCCTAATGCAAATATGAATGAAGATGGCCCTT
CCATGTCTGTGAATTTCACCTCACCAGGCTGCCTAGACCACATAATGAAATATAAAAAAA 
AGTGTGTCAAGGCCGGAAGCCTGTGGGATCCGAACATCACTGCTTGTAAGAAGAATGAGG 
AGACAGTAGAAGTGAACTTCACAACCACTCCCCTGGGAAACAGATACATGGCTCTTATCC 
5 AACACAGCACTATCATCGGGTTTTCTCAGGTGTTTGAGCCACACCAGAAGAAACAAACGC 
GAGCTTCAGTGGTGATTCCAGTGACTGGGGATAGTGAAGGTGCTACGGTGCAGCTGACTC 
CATATTTTCCTACTTGTGGCAGCGACTGCATCCGACATAAAGGAACAGTTGTGCTCTGCC 
CACAAACAGGCGTNCCTTTTCCTCTGGATAACAACAAAAGCAAGCCGGGAGGCTTGGCTG 
CTCTCCTTCTGCTGGCCTTTGCTGTGGCCACATTGGTGCTGGTGGCAGGGATCTATCTAA 
TGTGGATGCACGTCTCGTGGTTTACCCATCTGAAATATGTTCN

BX431018 

ATTTTTCCTCTTGTGGCAGCGACTGGCATCCGACATAAAGGAACAGTTGTGCTCTGCCCA 
CAAACAGGCGTCCCTTTCCCTCTGGATAACAACAAAAGCAAGCCGGGAGGCTGGCTGCCT 

CTCCTCCTGCTGTCTCTGCTGGTGGCCACATGGGTGCTGGTGGCAGGGATCTATCTAATG
TGGAGGCACGAAAGGATCAAGAAGACTTCCTTTTCTACCACCACACTACTGCCCCCCATT
AAGGTTCTTGTGGTTTACCCATCTGAAATATGTTTCCATCACACAATTTGTTACTTCACT 
GAATTTCTTCAAAACCATTGCAGAAGTGAGGTCATCCTTGAAAAGTGGCAGAAAAAGAAA 
ATAGCAGAGATGGGTCCAGTGCAGTGGCTTGCCACTCAAAAGAAGGCAGCAGACAAAGTC 

20 GTCTTCCTTCTTTCCAATGACGTCAACAGTGTGTGCGATGGTACCTGTGGCAAGAGCGAG 
GGCAGTCCCAGTGAGAACTCTCAAGACCTCTTCCCCCTTGCCTTTAACCTTTTCTGCAGT 
GATCTAAGAAGCCAGATTCATCTGCACAAATACGTGGTGGTCTACTTTAGAGAGATTGAT 
ACAAAAGACGATTACAATGCTCTCAGTGTCTGCCCCAAGTACCACCTCATGAAGGATGCC 
ACTGCTTTCTGTGCAGAACTTCTCCATGTCAAGCAGCAGGTGTCAGCAGGAAAAAGATCA 

25 CAAGCCTGCCACGATGGCTGCTGCTCCTTGTAGCCCACCCATGAGAAGCAAGAGACCTTA 
AGGCTTCTATCCCACCANTACAGGNAAAAACGTGTGATGATCCTGAAGCTTACTATGCAG 
CCTACAACAGGCTTAGTATTAAAACATTTATACCCATAAATTTTCAAATTGCT 

AL535617 

30 TAGGTGACACTATAGAACAAGTTTGTACAAAAAAGCAGGCTGGTACCGGTCCGGAATTCC 
CGGGATAGTGGMCCGGCGAKGTCGCTCGTGCTGCTAAGCCTGGCCGCGCTGTGCAGGAGC 
GCCGTACCCCGAGAGCCGACCGTTCAATGTGGCTCTGAAACTGGGCCATCTCCARAGTGG 
ATGSKACAACATGATCTAATCCCGGGAGACTTGAGGGACCTCCGAGTAGAACCTGTTACA 
ACTAGTGTTGCAACAGGGGACTATTCAATTTTGATGAATGTAAGCTGGGTACTCCGGGSA 

35 GATGCCAGCATCCGCTTGTTGAAGGCCACCAAGATTTGTGTGAMGGGCAAAAGCAACWTC 
CAGTCCTACAGCWGTGTGAGGTAGCAATTACACAGAGAGCACATATCCAGACTCTAGACC 
AGACCCTCTGGWGGTAAATGGACATTTTCCTATATCGGCTTCCCTGTAGAGCTGAACACA 
GTCTATATTCATTGGGGCCCAWAATAWWCCTAATGCAAATATGAATGAAGATGGCCCTTC 
CATGTCTGTGAATTTCACCTCACCAGGCTGCCTAGACCACATAATGAAATAWAAAAAAAA 

40 GTGTGTCAAGGCCGGAAGCCTGTGGGATCCGAACATCACTGCTTGTAAGAAGAATGARGA 
GACAGTAGAAGTGAACTTCACAACCACTCCCCTGGGAAACAGATAMATKGCTCTTATCCA 
ACACARMACTATCATCGGGTTTTCTCAGGTGTTTGAGCCACACCAGAAGAAACAAACGCG 
AGCTTCAGTGGTGATTCCAGTGACTGGGGATAGTGAAGGTGCTACGGTGCAGCTGACTCC
ATATTTTCCTACTTGTGGCAGCGWCTGCATCCGACATAAAGGAACAGTTGTGCTCTGCCC 
ACAAACAGGCGTCCCTTTYCCTCTGGATAACAACAAAAGCAACYGGGAGSTGGYTGYCT 

AL525465

WAATWAKADDRATANHTGAAAACTATAACCATTTNTGATAATNGNAANAATAAAGCATCT 
TCAGCCAAACATCTAGTCTTCCATAGACCATGCATTGCAGTGTACCCAGAWCTGTTTAGC 
TAATATTCTATGTTTAATTAATGAATACTAACTCTAAGAACCCCTCACTGATTCACTCAA 
TAGCATCTTAAGTGAAAAACCTTCTATTACATGCAAAAAATCATTGTTTTTAAGATAACA 

AAAGTAGGGAATAAACAAGCTGAACCCACTTTTACTGGACCAAATGATCTATTATATGTG
TAACCACTTGTATGATTTGGTATTTGCATAAGACCTTCCCTCTACAAACTAGATTCATAT 
CTTGATTCTTGTACAGGTGCCTTTTAACATGAACAACAAAATACCCACAAACTTGTCTAC 
TTTTGCCTAAAGTTACCTATTAGAGGTCACTGTSAGAGTKCTCAGTTTCTTAGTTACTAT 
TTAASTTTTSATGTTCAAAATGAAAATAATTCTKAAGTKGAAAGSGCTCTTGAAGTAACC 

TTTTTATAAATGAGTTATTATAATGGTTTACTTAAATAAAAVAGAGGGGKTTTTGCGGTG
GCTCATGCCTCCAATCCCAGCACTTTGGCAAGGCCAAGGCAAAAVGATCGCTCAAGACCA 
GGCTACGTCACAAAGCGAGACCTCCATCTCTACAAAAGATTTAAAAAATTAGCTGAGTGT 
GATGGTGTGAGCCTGTGGTCCCAGCTACTAGGGAGGCTGAGATGGGAGGATCACTTGAGC 
CCTGGAGGTCAAGGGTGCAGTAAACGGTGATTGTGCCACTGCACTCCATCCTGGGTGAGA 

20 GCAGACCCTGTCTAAAACAAACAAACGAAAAAACCCCCACAGAATGACAGAACATAAAAG 
ATGCACATTTTGTCTTCCAACTTTTTACTCTTCTAAAAGCATCTTTTTTAAATTTTTTAA 
ATTTTTTTTTTTTTGAGACAGAGTTTCACTCTGTCACACAGGCTGGAGTGMGTGGCGTGA 
CTCGGCTCACTAMAACTCTGCYTCCGGGGTYACSCATCTCCTGCWCAGCTCCTGAGAAGC 
KGGAYAMAGGMCCACACAAACCAGTAAYTTTATWTTTTGAAAAAGGGTTYACCTGTASMA 

25 GRAGGCTGAATCCGACMAARTMACCMCCACYYCAAADGAGGAWAAGKGKRSMGGSCBGGC 
A 

BX453536 

TTATGGGGGGCAGTAGTGTGGTGGTAGAAAAGGAAGTCTTCTTGATCCTTTCGTGCCTCC 
30 CATTAGATAGATCCCTGCCACCAGCACCCATGTGGCCACCAGCAGAGACAGCAGGAGGAG 
AGGCAGCCAGCCTCCCGGCTTGCTTTTGTTGTTATCCAGAGGGAAAGGGACGCCTGTTTG 
TGGGCAGAGCACAACTGTTCCTTTATGTCGGATGCAGTCGCTGCCACAAGTAGGAAAATA 
TGGAGTCAGCTGCACCGTAGCACCTTCACTATCCCCAGTCACTGGAATCACCACTGAAGC 
TCGCGTTTGTTTCTTCTGGTGTGGCTCAAACACCTGAGAAAACCCGATGATAGTGCTGTG 
35 TTGGATAAGAGCCATGTATCTGTTTCCCAGGGGAGTGGTTGTGAAGTTCACTTCTACTGT 
CTCCTCATTCTTCTTACAAGCAGTGATGTTCGGATCCCACAGGCTTCCGGCCTTGACACA 
CTNTNTTTTATATTTCATTATGTGGTCTAGGCAGCCTGGTGAGGTGAAATTCACAGACAT 
GGAAGGGCCATCTTCATTCATATTTGCATTAGGAATATTATGGGCCCCAATGAAATAGAC 
TGTGTTCAGCTCTACAGGGGAAGCCGATATAGGAAAATGTCCATTTACCACCAGAGGGTC 
40 TGGTCTGAGTCTTGAAGGCCTTTTGTGTTATTGCACCTTACACAGCTGTTAGACTGGGAA 
GTTGCTTTTGCCCCGCACACAAATCTTGTGGGCCTTCAACAGCGGATGCTGCCATTTGCC 
CCGAAGTCCCCAGCTCAATTCATTAAAAATTGAATAGGCCCCTTGTGGCAACCCTAGTTG 
GTACAGGGTTTTACTTGGGGGGCCCCTCTAAGTTTCCCCGGGATATAAACAAAGTGTGG



BX453537 

TTATGGGGGGCAGTAGTGTGGTGGTAGAAAAGGAAGTCTTCTTGATCCTTTCGTGCCTCC 
5 ACATTAGATAGATCCCTGCCACCAGCACCCATGTGGCCACCAGCAGAGACAGCAGGAGGA 
GAGGCAGCCAGCCTCCCGGCTTGCTTTTGTTGTTATCCAGAGGGAAAGGGACGCCTGTTT 
GTGGGCAGAGCACAACTGTTCCTTTATGTCGGATGCAGTCGCTGCCACAAGTAGGAAAAT 
ATGGAGTCAGCTGCACCGTAGCACCTTCACTATCCCCAGTCACTGGAATCACCACTGAAG 
CTCGCGTTTGTTTCTTCTGGTGTGGCTCAAACACCTGAGAAAACCCGATGATAGTGCTGT 

GTTGGATAAGAGCCATGTATCTGTTTCCCAGGGGAGTGGTTGTGAAGTTCACTTCTACTG
TCTCCTCATTCTTCTTACAAGCAGTGATGTTCGGATCCCACAGGCTTCCGGCCTTGACAC 
ACTTTTTTTTATATTTCATTATGTGGTCTAGGCAGCCTGGTGAGGTGAAATTCACAGACA 
TGGAAGGGCCATCTTCATTCATATTTGCATTAGGAATATTATGGGCCCCAATGAAATAGA 
CTGTGTTCAGCTCTACAGGGAAGCCGATATAGGAAAATGTCCATTTACCACCAGAGGGTC 

TGGTCTGAGTCTGGAAGGCCTCTGTGTAATTGCACCTCACACAGCTGTAGGACTGGGAGT
TGCTTTTGCCCGTACACAAATCTTGTTGGCCTTCAACAAGCGGATGCTGGCATCTGGCGG 
GGGTACCCAGCTTACATTCATCAAAATTGAATAGTCCCCTTGTTGCAACACTAGTTTGTA 
AACAGGTTCTACTCCGGGGGTCCCCTCAGTCTCCCGG 



AV728945

CAAATATGAATGAAGATGGCCCTTCCATGTCTGTGAATTTCACCTCACCAGGCTGCCTAG 
ACCACATAATGAAATATAAAAAAAAGTGTGTCAAGGCCGGAAGCCTGTGGGATCCGAACA 
TCACTGCTTGTAAGAAGAATGAGGAGACAGTAGAAGTGAACTTCACAACCACTCCCCTGG 
GAAACAGATACATGGCTCTTATCCAACACAGCACTATCATCGGGTTTTCTCAGGTGTTTG 
25 AGCCACACCAGAAGAAACAAACGCGAGCTTCAGTGGTGATTCCAGTGACTGGGGATAGTG 
AAGGTGCTACGGTGCAACTGACTCCATATTTTCCTACTTGTGGCAGCGACTGCATCCGAC 
ATAAAGGAACAGTTGTGCTCTGCCCACAAACAGGCGTCCCTTTCCCTCTGGATAACAAC 



AV728939 

30 GCAAATATGAATGAAGATGGCCCTTCCATGTCTGTGAATTTCACCTCACCAGGCTGCCTA 
GACCACATAATGAAATATAAAAAAAAGTGTGTCAAGGCCGGAAGCCTGTGGGATCCGAAC 
ATCACTGCTTGTAAGAAGAATGAGGAGACAGTAGAAGTGAACTTCACAACCACTCCCCTG 
GGAAACAGATACATGGCTCTTATCCAACACAGCACTATCATCGGGTTTTCTCAGGTGTTT 
GAGCCACACCAGAAGAAACAAACGCGAGCTTCAGTGGTGATTCCAGTGACTGGGGATAGT 

35 GAAGGTGCTACGGTGCAGCTGACTCCATATTTTCCTACTTGTGGCAGCGACTGCATCCGA 
CATAAAGGAACAGTTGTGCTCTGCCCACAAACAGGCGTCCCTTTCCCTCTGGATAACAAC 



AV727345 

GCAAATATGAATGAAGATGGCCCTTCCATGTCTGTGAATTTCACCTCACCAGGCTGCCTA 
40 GACCACATAATGAAATATAAAAAAAAGTGTGTCAAGGCCGGAAGCCTGTGGGATCCGAAC 
ATCACTGCTTGTAAGAAGAATGAGGAGACAGTAGAAGTGAACTTCACAACCACTCCCCTG
GGAAACAGATACATGGCTCTTATCCAACACAGCACTATCATCGGGTTTTCTCAGGTGTTT 
GAGCCACACCAGAAGAAACAAACGCGAGCTTCAGTGGTGATTCCAGTGACTGGGGATAGT 
GAAGGTGCTACGGTGCAGCTGACTCCATATTTTCCTACTTGTGGCAGCGACTGCATCCGA 
CATAAAGGAACAGTTGTGCTCTGCCCACAAACAGGCGTCCCTTTCCCTCTGGATAACAAC
AAAAGCAAGCCGGGAGGCTGGCTGCCTCTCCTCCTGCTGTCTCTGCTGGTGGCCACATGG 
GTGCTGGTGGCAGGGATCTATCTAATGTGGAGGCACGAAAGGATCAAGAAGACTTCCTTT 
TTTACCACCACACTACTGTCTCCCATTAAAGATCTTGTGGTTTATCCATCTGAAATATTG 
TTCCATTACACATATTGGTACCTAACTGAAATTCTTTAAAACCATTGCAAATTGAGGTCA 
CTCTTGAAAGGGCGTG

Sequences identified as those of CACNA1D cluster 



BM128550 

CGGCTCCTACCTTTTGCCCGATCCCCTTCCCCATTCCGCCCCCGCCCCAACGCAGTGCAC
AGTGCCCTGCACACAGTAGTCGCTCAATAAATGTTCGTGGATGATGATGATGATGATGAT 
GAAAAAAATGCAGCATCAACGGCAGCAGCAAGCGGACCACGCGAACGAGGCAAACTATGC 
AAGAGGCACCAGACTTCCTCTTTCTGGTGAAGGACCAACTTCTCAGCTGAATAGCTCCAA 
GCAAACTGTCCTGTCTTGGCAAGCTGCAATCGATGCTGCTAGACAGGCCAAGGCTGCCCA
AACTATGAGCACCTCTGCACCCCCACCTGTAGGATCTCTCTCCCAAAGAAAACGTCAGCA
ATACGCCAAGAGCAAAAAACAGGGTAACTCGTCCAACAGCCGACCTGCCCGCGCCCTTTT 
CTGTTTATCACTCAATAACCCCATCCGAAGAGCCTGCATTAGTATAGTGGAATGGAAACA 
TTTGACATATTTATATTATTGGCTATTTTTTGCCAAT

BI755471

GAATATGACCCTGAGGCAAAGGGAAGGATAAACACCTTGATGTGGTCACTCTGCTTCGAC 
GCATCCAGCCTCCCCTGGGGTTTGGGAAGTTATGTCCACACAGGGTAGCGTGCAAGAGAT 
TAGTTGCCATGAACATGCCTCTCAACAGTGACGGGACAGTCATGTTTAATGCAACCCTGT 
TTGCTTTGGTTCGAACGGCTCTTAAGATCAAGACCGAAGGGAACCTGGAGCAAGCTAATG
AAGAACTTCGGGCTGTGATAAAGAAAATTTGGAAGAAAACCAGCATGAAATTACTTGACC
AAGTTGTCCCTCCAGCTGGTGATGATGAGGTAACCGTGGGGAAGTTCTATGCCACTTTCC 
TGATACAGGACTACTTTAGGAAATTCAAGAAACGGAAAGAACAAGGACTGGTGGGAAAGT
ACCCTGCGAAGAACACCACAATTGCCCTACAGGCGGGATTAAGGACACTGCATGACATTG 
GGCCAGAAATCCGGCGTGCTATATCGTGTGATTTGCAAGATGACGAGCCTGAGGAAACAA
AACGAGAAGAAGAAGATGATGTGTTCAAAAGAAATGGTGCCCTGCTTGGAAACCATGTCA
ATCATGTTAATAGTGATAGGAGAGATTCCCTTCAGCAGACCAATAGCACCACCGTCCCCT 
GCATTGTCCAAAGGCCTTCAATTCCACCTGCAAGTGATACTGAGAAACCGCTGTTTCCTC 
CAGCAGGAAATTCGGGGTGTCATAACCATCATAACCATTAATTCCATAGGAAAGCAAGGT 
TCCCACTTCAACAATGCCAGTCTCGAATAGTGCCAATATGTCCAAAGCTTGCCATGGTAA
GCGGGCCAGCATTGGGAACC

BQ549084

GCACGAGATTAATTAGACTTTTGTATAAGAGATGTCATGCCTCAAGAAAGCCATAAACCT 
GGTAGGAACAGGTCCCAAGCGGTTGAGCCTGGCAGAGTACCATGCGCTCGGCCCCAGCTG 
CAGGAAACAGCAGGCCCCGCCCTCTCACAGAGGATGGGTGAGGAGGCCAGACCTGCCCTG 
CCCCATTGTCCAGATGGGCACTGCTGTGGAGTCTGCTTCTCCCATGTACCAGGGCACCAG
GCCCACCCAACTGAAGGCATGGCGGCGGGGTGCAGGGGAAAGTTAAAGGTGATGACGATC 
ATCACACCTGTGTCGTTACCTCAGCCATCGGTCTAGCATATCAGTCACTGGGCCCAACAT 
ATCCATTTTTAAACCCTTTCCCACAAATACACTGCGTCCTGGTTCCTGTTTAGCTGTTCT 
GAAATACGGTGTGTAAGTAAGTCAGAACCCAGCTACCAGTGATTATTGCGAGGGCAATGG 
GACCTCATAAATAAG

BQ549571 

TTTTTTTTTTTTTTTTTTAGTGGGGAACTACAATTATTAGGACCCATGGATATTGCTGCA 
GTTCAAATACAATACAGTAATTACAAAATATAGACCATCTCTTTACAAATACAAATTATA 
GTATATTACAAGTCATGTACAGTAAATCTATAATTTTAAACAAACTAGTGTATCTAAGTT 
TACCTGGTTGCGAGTGCATTATTATTCCAGTTTACAGTTGCCCTTAGCGTGACAGTCAGA 
AACCGACCATCGGAGTGATATTCTCTTATGTAAACTGGCGTCACATCACAGAAAACCTTA 
TTTATGAGGTCCCATTGCCCTCGCAATAATCACTGGTAGCTGGGTTCTGACTTACTTACA 
CACCGTATTTCAGAACAGCTAAACAGGAACCAGGACGCAGTGTATTTGTGGGAAAGGGTT 
TAAAAATGGATATGTTGGGCCCAGTGACTGATATGCTAGACCGATGGCTGAGGTAACGAC 
ACAGGTGTGATGATCGTCATCACCTTTAACTTTCCCCTGCACCCCGCCGCCATGCCTTCC 
AGTTGGGTGGGCCTGGT 

AI693324 

CTCTGAGCACTACAATCAGCCAGATTGGTTGACACAGATTCAAGATATTGCCAACAAAGT 
CCTCTTGGCTCTGTTCACCTGCGAGATGCTGGTAAAAATGTACAGCTTGGGCCTCCAAGC 
ATACTCTTGTTCTCTTTACAACCGGTTTGATTGCTTCGTGGTGTGTGGTGGAATCACTGA 
GACGATCTTGGTGGAACTGGAAATCATGTCTCCCCTGGGGATCTCTGTGTTTCGGTGTGT 
GCGCCTCTTAAGAATCTTCAAAGTGACCAGGCACTGGACTTCCCTGAGCAACTTAGTGGC 
ATCCTTATTAAACTCCATGAAGTCCATCGCTTCGCTGTTGCTTCTGCTTTTTCTCTTCAT 
TATCATCTTTTCCTTGCTTGGGATGCAGCTGTTTGGCGGCAAGTTTAATTTTGATG 

R25307 

ACCAGCAGACCTGACTGTCCCCAGCAGCTTCCGGAACAAAAACAGCGACAAGAGAGGAGT 
GCGGACAGTTGGTGGAGGCAGTCCTGATATCCGAAGCTTGGGACGCTATGCAAGGGACCC
AAAATTTGTGTCAGCAACAAAACACGAAATCGCTGATGCCTGTGACCTCACCATCGACGA 
GATGGAGAGTGCAGCCAGCACCCTGCTTAATGGGAACGTGCGTCCCCGAGCCAACGGGGA 
TGTGGGCCCCCTCTCACACCGGCAGACTATGAGCTACAGGACTTTGGTCCTGGGCTTACA 
GCGACGAAGAGCCAGACCCTGGGGAGGGATTGAGGGAGGACCTGGGCGGATGAATTGATT 
TTGCNTCACCACCTTTGTTAGGCCCCCAGGCGAGGGGCAAG

R46658

TTTTTTTTTTNTTTTTTTTTTTGTGGAAAGATGATAGGTTTATAGTGACTCAAAATATTT 
TAGAAAAATTTCTGTAGTGTCAAGTTCTTTCAAACTTAAAATTTTAACCCCAGAGGATTT 
TCGCTGAATAAATGAGAATTGGCTCTATTTCTTCTACTTCTGGATAGCCCGNGTAAAAAT 
ACTAAT

H29256 

TTTTTTTTTTTTTTTTTTTTTGTGGAAAGATGATAGGTTTATAGNGACTCAAAATATTTT 
AGAAAAATTTCTGTAGTGTCAAGTTCTTTCAAACTTAAAATTTTAACCCCAGAGGATTTT 
CGCTGAATAAATGAGAATTGGCTCTATTTCTTCTACTTCTGGATAGCCCGAGTAAAAATA 
CTAATAATTTCTAGATTTTAGTGGGGAACTACAATTATTAGGACCCATGGATATTGCTGC 
AGTTCAAATACAATACAGTAATTACAAAATATAGACCATCTCTTTACAAATACAANTTAT 
AGNATATTACAAGTCATGTACAGTAAATCTATAATTTTGGACAANCTAGTGTATCTAAGT 
TTACCNGGGGTGCGAGTGCCTTATTNTTCCNGTTTACAGTTGCCCTTAGCGTGACAGTCN 
GGAACCGNCCTTC 

H29339 

GCCTGACTGTCCCCAGCAGCTTCCGGAACAAAAACAGCGACAAGCAGAGGAGTGCGGACA 
NTTTGGTGGAGGCAGTCCTGATATCCGAAGCTTGGGACGCTATGCAAGGGACCCAAAATT 
TGTGTCAGCAACAAAACACGAAATCGCTGATGCCTGTGACCTCACCATCGACGAGATGGA
GAGTGCAGCCAGCACCCTGCTTAATGGGAACGTGCGTCCCCGAGCCAACGGGGATGTGGG 
CCCCCTCTCACACCGGCAGACTATGAGCTACAGGACTTTGGTCCTGGGCTACAGCGACGA 
AGAGCCAGACCCTGGGAGGGATGAGGAGGAC 

BG716371

AGCGGTCGTAATAATGTAGTTCCCCACTAAAATCTAGAAATTATTAGTATTTTTACTCGG 
GCTATCCAGAAGTAGAAGAAATAGAGCAAATTCTCATTTATTCAGCGAAAATCCTCTGGG 
GTTAAAATTTTAAGTTGAAAGAACTTGACACTACAGAAATTTTTCTAAAATATTTGAGTC 
ACTATAAACCTATCATCTTTCCACAAGATATACCAGATGACTATTGCAGTCTTCTCTTGG 
GCAAGAGTTCCATGATTTGATACTGTACCTTGGATCCACCATGGGTGCAACTGTCTTGGT 
TTGTTGTTGACTTGAACCACCCTCTGGTAAGTAAGTGAATTACAGAGCAGGTCTAGCTGG
CTGCTCTGCCCCTTGGGTATCCATAGTTACGGTTTTCTCTGTGGCCCACCCAGGTGTTTT 
TGCATCGCTGGTGCAGAAATGCACAGGTGGATGAGATATAGCTGCTCTTGTCCTCTGGGG 
ACTGGTGGTGCTGCTTAAGAAATAAGGGGTGCTGGGGACAGAGGAGCAACGTGGTGATCT 
ATAGGATTGGAGTGTCGGGGTCTGTACAAATCGTATTGTTGCCTTTTACAAAACTGTGTA 
CTGTATGTTCTCTTTGAGGGCTTTTGTATGCAATTGAATGAGG 

AI537488 

TTTTCTTGTGGAAAGATGATAGGTTTATAGTGACTCAAAATATTTTAGAAAAATTTCTGT 
AGTGTCAAGTTCTTTCAAACTTAAAATTTTAACCCCAGAGGATTTTCGCTGAATAAATGA
GAATTGGCTCTATTTCTTCTACTTCTGGATAGCCCGAGTAAAAATACTAATAATTTCTAG
ATTTTAGTGGGGAACTACAATTATTAGGACCCATGGATATTGCTGCAGTTCAAATACAAT 
ACAGTAATTACAAAATATAGACCATCTCTTTACAAATACAAATTATAGTATATTACAAGT 
CATGTACAGTAAATCTATAATTTTAAACAAACCTGTGTATCTAAGTTTACCTGGTTG 

AA458692 

GACAAATAAAGCAATTATAAATGTATCTCACTTTAGAACAGACAAAAAAAGGGCATGCTA 
TGGAAATTGTTTAAATCTCAAGCAACAATGCTGATTAATTTCTGGTCAATAATCGTTCTA 
TAGTTCTCCTTCATGAAGCCTGGTGAGGTTCCAGGAAACAGCTTGATTTGGGAAGCCTCA 
GCAGAAAAGAAAGCATCTCAGAGGACACATAAAATGTCTGGCAACCCCTCTTGGCGGCCC
TCATCCAGCAAAGCTTGTGTGGTCTTGGCAACTGTCCTCAGGACTCTGCTTTCAAGATGA 
AAGAGGTGTAGCTTACCCGCTCAATACACCAAGTACAAGATTTAGTACGAAAAATGACCC 
AAAGATGACGAGACTGACAAGATACACCCAGGGCAATTCCAATCCCATAGCATCATTCAT 

AI393327

TTTATATTATTCACCACTTTGTTATGAAGACCTTACAAACCTCTTCTTAAGACATTCTTA 
CTCTGATCCAGGCAAAAACACTTCAAGGTTTGTAAATGACTCTTTCCTGACATAAATCCT 
TTTTTATTAAAATGCAAAATGTTCTTCAGAATAAAACTGTGTAATAATTTTTATACTTGG 
GAGTGCTCCTTGCACAGAGCTGTCATTTGCCAGTGAGAGCCTCCGACAGGGCAGGTACTG 
TGCCAGGGCAGCTCTGAAATTATGGATATTCTTATCCTCCTGGTTCCTTCGGTGCCAATG
GTAACCTAATACCAGCCGCAGGGAGCGCCATTTCTCCTAAAGGGCTACACCACTGTCAAC 
ATTATCCTGGACTCTGTGTCTCTCTCTGTTGGGTCTTGTGGCATCACATCAGGCCAAAAT 
TGCCAGACCAGGACCCTAAGTGTCTGATAGAGGCGATGATCTTTT 

AI520947

TTTTTTTTTTTTTTTTTTTTTCTTACAAAGAAAAATTTAATATTCGATGAGAGGTTGAAC 
CAGGCTTAAAGCAAACATACTAGGAAATGGGGCAGCCTGTAAGAATGCCAGTTTGTAAGT 
ACTGACTTTGGAAAAGATCATCGCCTCTATCAGACACTTAGGGTCCTGGTCTGGCAATTT 
TGGCCTGATGTGATGCCACAAGACCCAACAGAGAGAGACACAGAGTCCAGGATAATGTTG 
ACAGGGGGTAGCCCTTTAGGAGAAATGGCGCTCCCTGCGGCTGGTATTAGGTTACCATTG
GCACCGAAGGAACCAGGAGGATAAGAATAT 

AI248998 

TGTAAATAACAAACACCACTTTGTTATGAAGACCTTACAAACCTCTTCTTAAGACATTCT 
TACTCTGATCCAGGCAAAAACACTTCAAGGTTTGTAAATGACTCTTTCCTGACATAAATC
CTTTTTTATTAAAATGCAAAATGTTCTTCAGAATAAAACTGTGTAATAATTTTTATACTT
GGGAGTGCTCCTTGCACAGAGCTGTCATTTGCCAGTGAGAGCCTCCGACGGGGCAGGTAC 
TGTGCCAGGGCAGCTCTGAAATTATGGATATTCTTATCCTCCTGGTTCCTTCGGTGCCAA 
TGGTAACCTAATACCAGCCGCAGGGAGCGCCATTTCTCCTAAAGGGCTACACCACTGTCA 
ACATTATCCTGGACTCTGTGTCTCTCTCTGTTGGGTCTTGTGGCATCACATCAGGCCAAA
ATTGCCAGACCAGGACCCTAAGTGTCTGATAGAGGCGATGATCTTTTCCAAAGTCAGTAC
TTACAAACTGGCATTCTTACAG 

AI075844 

TTTTTTTTTTTGTAAATAACAAACACCACTTTGTTATGAAGACCTTACAAACCTCTTCTT
AAGACATTCTTACTCTGATCCAGGCAAAAACACTTCAAGGTTTGTAAATGACTCTTTCCT 
GACATAAATCCTTTTTTATTAAAATGCAAAATGTTCTTCAGAATAAAACTGTGTAATAAT 
TTTTATACTTGGGAGTGCTCCTTGCACAGAGCTGTCATTTGCCAGTGAGAGCCTCCGACG 
GGGCAGGTACTGTGCCAGGGCAGCTCTGAAATTATGGATATTCTTATCCTCCTGGTTCCT 
TCGGTGCCAATGGTAACCTAATACCAGCCGCAGGGAGCGCCATTTCTCCTAAAGGGCTAC
ACCACTGTCAACATTATCCTGGACTCTGTGTCTCTCTCTGTTGGGTCTTG 

AI869807 

GTAAATAACAAACACCACTTTGTTATGAAGACCTTACAAACCTCTTCTTAAGACATTCTT 
ACTCTGATCCAGGCAAAAACACTTCAAGGTTTGTAAATGACTCTTTCCTGACATAAATCC
TTTTTTATTAAAATGCAAAATGTTCTTCAGAATAAAACTGTGTAATAATTTTTATACTTG 
GGAGTGCTCCTTGCACAGAGCTGTCATTTGCCAGTGAGAGCCTCCGACGGGCAGGTACTG 
TGCCAGGGCAGCTCTGAAATATGGATATTCTTACCTCCTGGTTCTTTCGGTGCAAATGGT 
AACCTAATACCAGCCGCAGGGAGCGCCATTTCT 

AI869800 

GTAAATAACAAACACCACTTTGTTATGAAGACCTTACAAACCTCTTCTTAAGACATTCTT 
ACTCTGATCCAGGCAAAAACACTTCAAGGTTTGTAAATGACTCTTTCCTGACATAAATCC 
TTTTTTATTAAAATGCAAAATGTTCTTCAGAATAAAACTGTGTAATAATTTTTATACTTG 
GGAGTGCTCCTTGCACAGAGCTGTCATTTGCCAGTGAGAGCCTCCGACGGNGCAGGTACT
GTGCCAGGGCAGCTCTGAATTATGGATATTCTTATCCTCCTG 

AI243110 

TTTTTCTTACAAAGAAAAATTTAATATTCGATGAGAGGTTGAACCAGGCTTAAAGCAGAC 
ATACTAGGAAATGGTGCAGCCTGTAAGAATGCCAGTTTGTAAGTACTGACTTTGGAAAAG
ATCATCGCCTCTATCAGACACTTAGGGTCCTGGTCTGGCAATTTTGGCCTGATGTGATGC 
CACAAGACCCAACAGAGAGAGACACAGAGTCCAGGATAATGTTGACAGTGGTGTAGCCCT 
TTAGGAGAAATGGCGCTCCCTGCGGCTGGTATTAGGTTACCATTGGCACCGAAGGAACCA 
GGAGGATAAGAATATCCATAATTTCAGAGCTGCCCTGGCACAGTACCTGCCCCGTCGGAG 
GCTCTCACTGGCAAATGACAGCTCTGTGCAAGGAGCACTC

AI955764 

TTATCTTGTGGAAAGATGATAGGTTTATAGTGACTCAAAATATTTTAGAAAAATTTCTGT 
AGTGTCAAGTTCTTTCAAACTTAAAATTTTAACCCCAGAGGATTTTCGCTGAATAAATGA
GAATTGGCTCTATTTCTTCTACTTCTGGATAGCCCGAGTAAAAATACTAATAATTTCTAG
ATTTTAGTGGGGAACTACAATTATTAGGACCCATGGATATTGCTGCAGTTCAAATACAAT 
ACAGTAATTACAAAATATAGACCATCTCTTTACAAATACAAATTATAGGATATTACAAGG 
CATGTACAGTAAATCTATAATTTTAAACAAACTAGTGTATCTAAGTTTACCTGGTTGCGA 
GTGCATTATTATTCCAGTTTACAGTTGCCCTTAGCGTGACAGTCAGAAACCGACCATCGG
AGTGATATTCTCTTATGTAAACTGGCGTCACATCACAGAAAACCTTATTTATGAGGTCCC 
AT 

AA192669

GCCCTCACAGCCCACCACGCCTGGCCTTCGCCCAATTCTGAAACTTCGTAGGATAGAGCT
GGAAAGTGCCACATGGTGAAGCGAGATCCAGCTGTCTGGGTGGATGTCGGAGTCCATAGG 
CTGAGCAGAGATGGTTCTTAGTGAGGTTCTCGCTGCCAGTTGACGGTGAAATCATAGCTG 
CCATTTACATTTTGTGAGATTATGAAAAACATAAGACTAAAGAAACTAAATGTGTTATTC 
CTGTGGACACAAAAATGTGTGTTTTTCAGATGGGGAGGGGACCAAAAAGGAAAAACATTT
CATCTTAAAACTTTCCTAAGACAAAGGAAAACAAAAAACCATGCTCCTACAACTTCAAAT
TTTTCTTACCAAAGAAAAATTTAATATTCGATGAGAGGTTGAACCAGGCTTAAAGCAGAC 
ATACTAGGGAATGGGTGCAGCCTGTAAGAATGCCAGTTT

AA192157 

GTAAATAACAAACACCACTTTGTTATGAAGACCTTACAAACCTCTTCTTAAGACATTCTT
ACTCTGATCCAGGCAAAAACACTTCAAGGTTTGTAAATGACTCTTTCCTGACATAAATCC 
TTTTTTATTAAAATGCAAAATGTTCTTCAGAATAAAACTGTGTAATAATTTTTATACTTG 
GGAGTGCTCCTTGCACAGAGCTGTCATTTGCCAGTGAGAGCCTCCGACGGGGCAGGTACT 
GTGCCAGGGCAGCTCTGAAATTATGGATATTCTTATCCTCCTGGTTCCTTCGGTGCCAAT
GGTAACCTAATACCAGCCGCAGGAGCGCCATTTCTCCTAAAGGGCTACACCACTGTCAAC
ATTATCCTGGGACTCTGTGTCTCTCTCTGTTGGGTCTTGTGGCATCACATCAGGCCAAAA 
TTGGCCAGACCAGGACCCCAAGTGGTCTGATAGAAGGCGATGATCTTTTCCAAAGTCAGT 
ACTTACA 

AI361691

GTTTAAAATTATAGATTTACTGTACATGACTTGTAATATACTATAATTTGTATTTGTAAA 
GAGATGGTCTATATTTTGTAATTACTGTATTGTATTTGAACTGCAGCAATATCCATGGGT 
CCTAATAATTGTAGTTCCCCACTAAAATCTAGAAATTATTAGTATTTTTACTCGGGCTAT 
CCAGAAGTAGAAGAAATAGAGCCAATTCTCATTTATTCAGCGAAAATCCTCTGGGGTTAA 
AATTTTAAGTTTGAAAGAACTTGACACTACAGAAATTTTTCTAAAATATTTTGAGTCACT
ATAAACCTATCATCTTTCCACAAGATATACCAGATGACTATTTGCAGTCTTTTCTTTGGG 
CAAGAGTTCCATGATTTTGATACTGTACCTTTGGATCCACCATGGGTTGCAACTGTCTTT 
GGTTTTGTTTGTTTGACTTGAACCA 

AI914244

TTCGCTGAATAAATGAGAATTGGCTCTATTTCTTCTACTTCTGGATAGCCCGAGTAAAAA
TACTAATAATTTCTAGATTTTAGTGGGGAACTACAATTATTAGGACCCATGGATATTGCT 
GCAGTTCAAATACAATACAGTAATTACAAAATATAGACCATCTCTTTACAAATACAAATT 
ATAGTATATTACAAGTCATGTACAGTAAATCTATAATTTTAAACAAACTAGTGTATCTAA 
GTTTACCTGGTTGCGAGTGCATTATTATTCCAGTTTACAGTTGCCCTTAGCGTGACAGTC
AGAAACCGACCAT 

AW008769 

TTTTATCTTGTGGAAAGATGATAGGTTTATAGTGACTCAAAATATTTTAGAAAAATTTCT 
GTAGTGTCAAGTTCTTTCAAACTTAAAATTTTAACCCCAGAGGATTTTCGCTGAATAAAT
GAGAATTGGCTCTATTTCTTCTACTTCTGGATAGCCCGAGTAAAAATACTAATAATTTCT 
AGATTTTAGTGGGGAACTACAATTATTAGGACCCATGGATATTGCTGCAGTTCAAATACA 
ATACAGTAATTACAAAATATAGACCATCTCTTTACAAATACAAATTATAGTATATTACAA 
GTCATGTACAGTAAATCTATAATTTTAAACAAACTAGTGTATCTAAGTTTACCTGGTTGC 
GAGTGCATTATTATTCCAGTTTACAGTTGCCCTTAGCGTGACAGTCAGAAACC

AW008794 

TTTTATCTTGTGGAAAGATGATAGGTTTATAGTGACTCAAAATATTTTAGAAAAATTTCT 
GTAGTGTCAAGTTCTTTCAAACTTAAAATTTTAACCCCAGAGGATTTTCGCTGAATAAAT
GAGAATTGGCTCTATTTCTTCTACTTCTGGATAGCCCGAGTAAAAATACTAATAATTTCT
AGATTTTAGTGGGGAACTACAATTATTAGGACCCATGGATATTGCTGCAGTTCAAATACA 
ATACAGTAATTACAAAATATAGACCATCTCTTTACAAATACAAATTATAGTATATTACAA 
GTCATGTACAGTAAATCTATAATTTTAAACAAACTAGTGTATCTAAGTTTACCTGGTTGC 
GAGTGCATTATTATTCCAGTTTACAGTTGCCCTTAGCGTGACAGTCAGAAACCGACCATC
GGAGTGATATTCTCTTATGTAAACTGGCGTCACATCACAGAAAACCTTATTTATTT

AA877582 

TTTTTTTTTTAGAGCCAATTCTCATTTATTCAGCGAAAATCCTCTGGGGTTAAAATTTTA 
AGTTTGAAAGAACTTGACACTACAGAAATTTTTCTAAAATATTTTGAGTCACTATAAACC 
TATCATCTTTCCACAAGATATACCAGATGACTATTTGCAGTCTTTTCTTTGGGCAAGAGT
TCCATGATTTTGATACTGTACCTTTGGATCCACCATGGGTTGCAACTGTCTTTGGTTTTG 
TTTGTTTGACTTGAACCACCCTCTGGTAAGTAAGTGAATTACAGAGCAGGTCCAGCTGGC 
TGCTCTGCCCCTTGGGTATCCATAGTTACGGTTTTCTCTGTGGCCCACCCAGGGTGTTTT 
TTGCATCGCTGGTGCAGAAATGCACAGGTGGATGAGATATAGCTGC 

AI051972 

TTTTTTTTTGTAAATAACAAACACCACTTTGTTATGAAGACCTTACAAACCTCTTCTTAA 
GACATTCTTACTCTGATCCAGGCAAAAACACTTCAAGGTTTGTAAATGACTCTTTCCTGA 
CATAAATCCTTTTTTATTAAAATGCAAAATGTTCTTCAGAATAAAACTGTGTAATAATTT 
TTATACTTGGGAGTGCTCCTTGCACAGAGCTGTCATTTGCCAGTGAGAGCCTCCGACAGG
GCAGGTACTGTGCCAGGGCAGCTCTGAAATTATGGATATTCTTATCCTCCTGGTTCCTTC
GGTGCCAATGGTAACCTAATACCAGCCGCAGGGAGCGCCATTTCTCCTAAAGGGCTACAC 
CACTGTCAACATTATCCTGGACTCTGTGTCTCTCTCTGTTGAGTCTTGTGGCATCACATC 
AGGCCAAAATTGCCAGACCAGGACCCTAAGTGTCTGATAGAGGCGATGATCTT 

AI017959 

TTTAGAGCCAATTCTCATTTATTCAGCGAAAATCCTCTGGGGTTAAAATTTTAAGTTTGA 
AAGAACTTGACACTACAGAAATTTTTCTAAAATATTTTGAGTCACTATAAACCTATCATC 
TTTCCACAAGATATACCAGATGACTATTTGCAGTCTTTTCTTTGGGCAAGAGTTCCATGA 
TTTTGATACTGTACCTTTGGATCCACCATGGGTTGCAACTGTCTTTGGTTTTGTTTGTTT
GACTTGAACCACCCTCTGGTAAGTAAGTGAATTACAGAGCAGGTCCAGCTGGCTGCTCTG 
CCCCTTGGGTATCCATAGTTACGGTTTTCTCTGTGGCCCACCCAGGGTGTTTTTTGCATC 
GCTGGTGCAGAAATGCACAGGTGGATGAGATATAGCTGCTCTTGTCCTCTGGGGACTGGT 
GGTGCTGCTTAAGAAATAAGGGGTGCTGGGGACAGAGGAGCAA 

N79331 

TTTTTTTTTTTTTTQT; ^ T ^ c; ^ CACCACTT 

CTTAAGACATTCTTACTCTGATCCAGGCAAAAACACTTCAAGGTTTGTAAATGACTCTTT 
CCTGACATAAATCCTTTTTG 

N62240.1 

ACAAAGAAAAATTTAATATTCGATGAGAGGTTGAACCAGGCTTAAAGCAGACATACTAGG 
AAATGGTGCAGCCTGTAAGAATGCCAGTTTGTAAGTACTGACTTTGGAAAAGATCATCGC 
CTCTATCAGACACTTAGGGTCCTGGTCTGGCAATTTTGGCCTGATGTGATGCCACAAGAC 
CCAACAGAGAGAGACACAGAGTCCAGGNTAATATTGACAGNAGGTGGANGCCCCCCT

AI240933 

TTTTTTTTTTTTTTTTTTTTGGTCCAAAATTTTTAATAGTATACAGACAACCTGTTAATT 
TTTTTTTTTTTTTTTTTTGGAAATAACAAACACCACTTTGTTATGAAGACCTTACAAACC
TCTTCTTAAGACATTCTTACTCTGATCCAGGCAAAAACACTTCAAGGTTTGGAAATGACT
CTTTCCTGACATAAATCCTTTTTTATTAAAATGCAAAAGGTTCTTCAGAATAAAACTGTG 
TAATAATTTTTATACTTGGGAGTGCTCCTTGCACAGAGCTGTCATTTGCCAG 

AI015031 

TTTTTCTTACAAAGAAAAATTTAATATTCGATGAGAGGTTGAACCAGGCTTAAAGCAGAC
ATACTAGGAAATGGTGCAGCCTGTAAGAATGCCAGTTTGTAAGTACTGACTTTGGAAAAG 
ATCATCGCCTCTATCAGACACTTAGGGTCCTGGTCTGGCAATTTTGGCCTGATGTGATGC 
CACAAGACCCAACAGAGAGAGACACAGAGTCCAGGATAATGTTGACAGTGGTGTAGCCCT 
TTAGGAGAAATGGCGCTCCCTGCGGCTGGTATTAGGTTACCATTGGCACCGAAGAGACCA
GGAGGATAAGAATATCCATAATTTCAGAGCTGCCCTGGCACAGTACCTGCCCCGTCGGAG
GCTCTCACTGGCAAATGACAGCTCTGTGCAAGGAGCACTCCCAAGTATAAAAATTATTAC 
ACAGTTTTATTCTG 

AI290994

TAAATAACAAACACCACTTTGTTATGAAGACCTTACAAACCTCTTCTTAAGACATTCTTA 
CTCTGATCCAGGCAAAAACACTTCAAGGTTTGTAAATGACTCTTTCCTGACATAAATCCT 
TTTTTATTAAAATGCAAAATGTTCTTCAGAATAAAACTGTGTAATAATTTTTATACTTGG 
GAGTGCTCCTTGCACAGAGCTGTCATTTGCCAGTGAGAGCCTCCGACGGGGCAGGTACTG 
TGCCAGGGCAGCTCTGAAATTATGGATATTCTTATCCTCCTGGTTCCTTCGGTGCCAATG
GTAACCTAATACCAGCCGCAGGGAGCGCCATTTCTCCTAAAGGGCTACACCACTGTCAAC 
ATTATCCTGGACTCTGTGTCTCTCTCTGTTGGGTCTTGTGGCATCACATCAGGCCAAAAT 
TGCCAGACCAGGACCCTAAGTGTCTGATAGA 

AA861160

TTTGTAAATAACAAACACCACTTTGTTATGAAGACCTTACAAACCTCTTCTTAAGACATT 
CTTACTCTGATCCAGGCAAAAACACTTCAAGGTTTGTAAATGACTCTTTCCTGACATAAA 
TCCTTTTTTATTAAAATGCAAAATGTTCTTCAGAATAAAACTGTGTAATAATTTTTATAC 
TTGGGAGTGCTCCTTGCACAGAGCTGTCATTTGCCAGTGAGAGCCTCCGAC 

AA915941 

TTGTAAATAACAAACACCACTTTGTTATGAAGACCTTACAAACCTCTTCTTAAGACATTC 
TTACTCTGATCCAGGCAAAAACACTTCAAGGTTTGTAAATGACTCTTTCCTGACATAAAT 
CCTTTTTTATTAAAATGCAAAATGTTCTTCAGAATAAAACTGTGTAATAATTTTTATACT
TGGGAGTGCTCCTTGCACAGAGCTGTCATTTGCCAGTGAGAGCCTCCGAAGGGGCAGGTA
CTGTGCCAGGGCAGCTCTGAAATTATGGATATTCTTATCCTCCTGGTTCCTTCGGTGCCA 
ATGGTAACCTAATACCAGCCGCAGGAGCGCCATTTCTCCTAAAGGGCTACACCACTGTCA 
ACATTATCCTGGACTCTGTGTCTCTCTCTGTTGGGTCTTGTGGCATCACATCAGGCCAAA 
ATTGCCAGACCAGGACCCTAAGTGTCTGATAGAGGCGATGATCTTTTCCAAAGTCAGTAC
TTA



AA493341 

GCTCGACTTTTTTTTTGGGGGAACGTTTTCATTAGGTTAACAGTGTTTGGCAAGCATTGG 
AAACACGGAATCTCACAGACAGATACAGGCAGAAAGAATCACAGTTCAATCCAAAAGCAA
CACACTGAGAGGACATCAGAGTCCAAACACATGCAGAGAAGCTGTCAGGGAGCAGCTAGG
AGACACGCAGAGTTGCCTCACACGTGGCAGCAGGAGAAGGTGCAACACGGATCCGACTGC 
TTACCCACTAAGGACACCAAGAACCAGGTTAAGGACGAAAAATGAGCCAAGGATGATCAG 
ACTAACAAAATACACCCATGGCCATTCCCATCCTATCGCATCATTTACCCAGTAGAGCAC 
GTCTGTCCAGCCCTCCATGGTGATGCACTGAAACACAGTAAGCATGGCAAAGGCAAAGTT
ATCAAAGTTGGTGATGCCTCCGTTCGGGCCAACCCAGCCACTCCTACATTCCGTGCCATT
GGCAGTACACTGGCGTCCATTCCCTGT

AI467998

TTTTTTTTTTTTTTTTTGGTCCAAAATTTTTAATAGTATACAGACAACCTGTTAATTTTT 
TTTTTTTTTTTTTTTGTAAATAACAAACACCACTTTGTTATGAAGACCTTACAAACCTCT
TCTTAAGACATTCTTACTCTGATCCAGGCAAAAACACTTCAAGGTTTGTAAATGACTCTT 
TCCTGACATAAATCCTTTTTTATTAAAATGCAAAATGTTCTTCAGAATAAAACTGTGTAA 
TAATTTTTATACTTGGGAGTGCTCCTTGCACAGAGCTGTCATTTGCCAGTGAGAGCCTCC 
GACGGGGCAGGTACTGTGCCAGGGCAGCTCTGAAATTATGGATATTCTTATCCTCCTGGT 
TCCTTCGGTGCCAATGGTAACCTAATACCAGCCGCAGGGAGCGCCATTTCTCCTAAAGGG
CTACACCACTGTCAACATTATCC 

AA885585 

TTTTTTTTTTTTTTTTTTCTTACAAAGAAAAATTTAATATTCGATGAGAGGTTGAACCAG 
GCTTAAAGCAGACATACTAGGAAATGGTGCAGCCTGTAAGAATGCCAGTTTGTAAGTACT
GACTTTGGAAAAGATCATCGCCTCTATCAGACACTTAGGGTCCTGGTCTGGCAATTTTGG 
CCTGATGTGATGCCACAAGACCCAACAGAGAGAGACACAGAGTCCAGGATAATGTTGACA 
GTGGTGTAGCCCTTTAGGAGAAATGGCGCTCCCTGCGGCTGGTATTAGGTTACCATTGGC 
ACCGA 

AI033648 

TGTAAATAACAAACACCACTTGGTTATGAAGACCTTACAAACCTCTTCTTAAGACATTCT 
TACTCTGATCCAGGCAAAAACACTTCAAGGTTTGTAAATGACTCTTTCCTGACATAAATC 
CTTTTTTATTAAAATGCAAAATGTTCTTCAGAATAAAACTGTGTAATAATTTTTATACTT 
GGGAGTGCTCCTTGCACAGAGCTGTCATTTGCCAGTGAGAGCCTCCGACGGGGCAGGTAC
TGTGCCAGGGCAGCTCTGAAATTATGGATATTCTTATCCTCCTGGTTCCTTCGGTGCCAA 
TGGTAACCTAATACCAGCCGCAGGGAGCGCCATTTCTCCTAAAGGGCTACACCACTGTCA 
ACATTATCCTGGACTC 

AI697633

ATTCCTGTTAATTTTGACAAGCTCAACGGCTGAAATCTAGGAATGGTTACTACCAAAAGC 
CCACCCAATCCAGCTCATTTTGCTATCGTTTTATAACAATTAATCTGCATTATATTTGGA 
TCCAGACAAATAAAGCAATTATAAATGTATCTCACTTTACAACAGACAAAAAAAGGGCAT 
GCTATGGAAATTGTTTAAATCTCAAGCAACAATGCTGATTAATTTCTGGTCAATAATCGT
TCTATAGTTCTCCTTCATGAAGCCTGGTGAGGTTCCAGGAAACAGCTTGATTTGGGAAGC
CTCAGCAGAAAAGAAAGCATCTCAGAGGACACATAAAATGTCTGGCAACCCCTCTTGGCG 
GCCCTCATCCAGCAAAGCTTGTGTGGTCTTGGCAACTGTCCTCAGGACTCTGCTTTCAAG 
ATGAAAGAGGTGTAGCTTACCCGCTCAATACACCAAGTACAAGATTTAGTACGAAAAATG 
ACCCAAAGATGACGAGACTGACACAATACACCCAGGGCAATTCAAATCCCATAGCATCAT
TCAT

AA523647

GGTCGACGTATTTGTAAAGAGATGGTCTATATCTTGTAATTACTGTATTGTATTTGAACT 
GCAGCAATATCCATGGGTCCTAATAATTGTAGTTCCCCACTAAAATCTAGAAATTATTAG 
TATTTTTACTCGGGCTATCCAGAAGTAGAAGAAATAGAGCCAATTCTCATTTATTCAGCG
AAAATCCTCTGGGGTTAAAATTTTAAGTTTGAAAGAACTTGACACTACAGAAATTTTTCT 
AAAATATTTTGAGTCACTATAAACCTATCATCTTTCCACAAGAAAAAAAAACAAAAAAAA 
AGTCGACG 



BQ710377

CAAAGTACTTCCCCACATTTAGCTGGATTTGTCTTTGGTTTGAAGAGGCTAATACGTGAA 
AGATTTGTTCACAGTTGGATGTCCCCTTTTCTGAACCATGAAGTAATATTGTGAATGGAG 
TTGAATGCTGAGGTTAGGGTGCCGGAAAGATTCAGGGTCCTTCGGTACCCTCACATGGCT 
TGGCTTTGGTAGAACAAGAAACTAAGCTCTGATTTGGCTTTAAATGAGAGTGCTAAATTT
CCTTTTTCTAATAAAGAACCTAGCTAAACATTTATATATACTTTTGAACACTGAACTTTC
TTGTTGCAGAGTTAACAGCTGTTGGGGGTAGCTGACAGCTGGATCCTGGTGCTGTTGGTA 
CCATGGTACCTGAAGTGCACAGGCTGGTAGCCACACCTGACATTAACAAGTGAGTGGTAA 
CCTCTCTGCCGCTGGCTCACAGCTACTGTTTCCATAGAAATGGCTGTCGGGATCAGTGGA 
AACGAGGTAAGTGAAAGTTTTCGCTGATCCTTGTTTCCATCAAGCTGACGTCTGTTTCCC
TGGCAACAGCAGTGGACAGCAGCCAGGCGCTAGCAACAGATTCAGTAGAGCTCTCACTTG
TCAGCTGTGGCTATCATCTGTTCCTGACCAAGTTCTTTTTTTTTTTTTTAATAATGTACA 
GAAAGACCTCTGANGGACCAGGANGCNACTCTGGCCACATGTGCCCTCCTGGATGCTCGT 
TTTGCAAATGGAGAGCTGTGTGCTGAGTTGACTTCTCTGTCCGCAGTTCCCCCTCCACTG 
NGGCTCTGGGGTTGNTGATGTGCAGGTAAAAAAAAGGAGGGTTGTTGAAGGTTATTAGTT
GTTCCAAGGGGAAGCCTGTTGAAACCTGGTTGATCCCCAATCCCTATGGGGAAGAAAAAT
CTCTTTAAGGGGCTTTTCATGCCCAGAGACCCAAATTTT 



BQ706920 

GGTGGCGATTCGGACGAGGGCAAAGACTTCCCCCATTTAGCTGGATTTGTCTTTGGTTTG 
AAGAGGCTAATACGTGAAAGATTTGTTCACAGTTGGATGTCCCCTTTTCTGAACCATGAA
GTAATATTGTGAATGGAGTTGAATGCTGAGGTTAGGGTGCCGGAAAGATTCAGGGTCCTT 
CGGTACCCTCACATGGCTTGGCTTTGGTAGAACAAGAAACTAAGCTCTGATTTGGCTTTA 
AATGAGAGTGCTAAATTTCCTTTTTCTAATAAAGAACCTAGCTAAACATTTATATATACT 
TTTGAACACTGAACTTTCTTGTTGCAGAGTTAACAGCTGTTGGGGGTAGCTGACAGCTGG 
ATCCTGGTGCTGTTGGTACCATGGTACCTGAAGTGCACAGGCTGGTAGCCACACCTGACA
TTAACAAGTGAGTGGTAACCTCTCTGCCGCTGGCTCACAGCTACTGTTTCCATAGAAATG 
GCTGTCGGGATCAGTGGAAACGAGGTAAGTGAAAGTTTTCGCTGATCCTTGTTTCCATCA 
AGCTGACGTCTGTTTCCCTGGCAACAGCAGTGGACAGCAGCCAGGCGCTAGCAACAGATT 
CAGGAGAGCTCTCACTTGTCAGCTGTGGCTATCATCTGTTCCTGACCAAGTTCTTTTTTT 
TTTTTTTAATAATGGACAGAAAGACCTCTGAGGACCCAGGAGGCACCTCTGGGCACATGT
GCCCTCCTGGATGCTCCTTTTGCAGATGGAGACCTGGGGGCTGAGTTGACTTCTCTGGCC
GCAGTTCCCCCTCCACCTGGGGCTCCTGGGTGGTGAGGGGCCAGGTAAAAAAAGGGAAGG
TGTTTGAGGGTATTAATGGGTCCCCGGGCGGGCTGATCGAATCCTGGGGACTCCACGTCC 
CTGGGGGGACAAGAATCTCTTCAACGGGGTTTTCCGGCCGGGAGCCGGAGTTTTTTATTC 
AGCGGG 

BQ016847 

TTTTTTTTTTTTTTTTTTCTTGTGGAAAGATGATAGGTTTATAGTGACTCAAAATATTTT 
AGAAAAATTTCTGTAGTGTCAAGTTCTTTCAAACTTAAAATTTTAACCCCAGAGGATTTT 
CGCTGAATAAATGAGAATTGGCTCTATTTCTTCTACTTCTGGATAGCCCGAGTAAAAATA
CTAATAATTTCTAGATTTTAGTGGGGAACTACAATTATTAGGACCCATGGATATTGCTGC
AGTTCAAATACAATACAGTAATTACAAAATATAGACCATCTCTTTACAAATACAAATTAT 
AGTATATTACAAGTCATGTACAGTAAATCTATAATTTTAAACAAACTAGTGTATCTAAGT 
TTACCTGGTTGCGAGTGCATTATTATTCCAGTTTACAGTTGCCCTTAGCGTGACAGTCAG 
AAACCGACCATCGGAGTGATATTCTCTTATGTAAACTGGCGTCACATCACAGAAAACCTT
ATTTATGAGGTCCCATTGCCCTCGCAATAATCACTGGTAGCTGGGTTCTGACTTACTTAC
ACACCGTATTTCAGAACAGCTAAACAGGAACCAGGACGCAGTGTATTTGGGGGAAAGGGT 
TTACAAATGGATATGTTGGGCCCAGTGACTGATATGCTAGACCGATGGCTGAGGTAACGA 
CACAGGTGTGATGATCGTCATCACCTTTAACT 

CA943595

TGCAAATAAGGACAAGCTCAGCGGCTGAAATCTACAAATGGGGACTACCAAAAGCCCACC 
CAATCCAGCTCATTTTGCTATCGTTTTATAACAATTAATCTGCATTATATTTGGATCCAG 
ACAAATAAAGCAATTATAAATGTATCTCACTTTAGAACAGACAAAAAAAGGGCATGCTAT 
GGAAATTGTTTAAATCTCAAGCAACAATGCTGATTAATTTCTGGTCAATAATCGTTCTAT
AGTTCTCCTTCATGAAGCCTGGTGAGGTTCCAGGAAACAGCTTGATTTGGGAAGCCTCAG
CAGAAAAGAAAGCATCTCAGAGGACACATAAAATGTCTGGCAACCCCTCTTGGCGGCCCT 
CATCCAGCAAAGCTTGTGTGGTCTTGGCAACTGTCCTCAGGACTCTGCTTTCAAGATGAA 
AGAGGTGTAGCTTACCCGCTCAATACACCAAGTACAAGATTTAGTACGAAAAATGACCCA 
AAGATGACGAGACTGACAAAATACACCCAGGGCAATTCAAATCCCATAGCATCATTCATC
TGCAAGAAATAAGATGGTCTCATAGGAGTGGGTTAATAAGAGGATTTAATAAGGA

BM008196 

GGCAAAGTACTTCCCCACATTTAGCTGGATTGGTCTTTGGTTTGAAGAGGCTAATACGTG 
AAAGATTTGTTCACAGTTGGATGTCCCCTTTTCTGAACCATGAAGTAATATTGTGAATGG
AGTTGAATGCTGACGGTTAGGGTGCCGGAAAGATTCAGGGTCCTTCGGTACCCTCACATG
GCTTGGCTTTGGTAGAACAAGAAACTAAGCTCTGATTTGGCTTTAAATGAGAGTGCTAAA 
TTTCCTTTTTCTAATAAAGAACCTAGCTAAACATTTATATATACTTTTGAACACTGAACT 
TTCTTGTCAGCAGAGTTAACAGCTGTAGGGGGTAGCTGACACGGCTGGATCCTGGTGCTG 
TTGGTACCATGGTACCTGAAGTGCACAGGCTGGTAGCCACACCTGACATTAACAACGTGA
GTGGTAACCTCTCTGCCGCTGGCTCACAGCTACTGTTTCCATCAGAAATGGCTGTCGGGC
TCACGTGGAAACGAGGTAAGTGAAAGTACGCTAGATCCTTGTTCCATCACAGCTGACGCT
CTGTTTCCCATGGCAACACCCAGCACGGACAAGCCGCCACGCCGCATAGACAACCACAAC
CACGTACAGCTCTCCACAAGTCAGCTCGTGGCTATCCATCATGTCCCTGAACAAGCCCAC 
ACCACCCCCCCCCAAGCGACACAGCAACGAGCACCACCCGGACGAACCAAAGGACGGACC 
CCCCTGCCCCAACCTCTCGCCCATCCGCGACAGACCCGCCAAGCAAACACGACAACCTAA 
CAAAGCAGAGGGACAGACCCATAGCGCCCGCTACCGGAAGCGTACACCACTTCCCAACAG
TAAGGCCAAAAGAGCGACGCGGAGCACGTGAACGGATAAGAAAACGAGAGAAGGCACGGC 
CGCATGGCAAACACACCAGCAAGCAGCAGACAGCACGTGGGCACGACACAGGACAGAAAG 
CAGCCCACCTCAGAGGGGACCAACGAAGAGTCGCACGAC 

BI769856

CTGGGCCCAACATATCCATTTTTAAACCCTTTCCCCCAAATACACTGCGTCCTGGTTCCT 
GTTTAGCTGTTCTGAAATACGGTGTGTAAGTAAGTCAGAACCCAGCTACCAGTGATTATT 
GCGAGGGCAATGGGACCTCATAAATAAGGTTTTCTGTGATGTGACGCCAGTTTACATAAG 
AGAATATCACTCCGGTGGTCGGTTTCTGACTGTCACGCTAAGGGCAACTGTAAACTGGAA
TAATAATGCACTCGCAACCAGGTAAACTTAGATACACTAGTTTGTTTAAAATTATAGATT
TACTGTACATGACTTGTAATATACTATAATTTGTATTTGTAAAGAGATGGTCTATATTTT 
GTAATTACTGTATTGTATTTGAACTGCAGCAATATCCATGGGTCCTAATAATTGTAGTTC 
CCCACTAAAATCTAGAAATTATTAGTATTTTTACTCGGGCTATCCAGAAGTAGAAGAAAT 
AGAGCCAATTCTCATTTATTCAGCGAAAATCCTCTGGGGTTAAAATTTTAAGTTTGAAAG
AACTTGACACTACAGAAATTTTTCTAAAATATTTTGAGTCACTATAAACCTATCATCTTT
CCACAAGATATACCAGATGACTATTTGCAGTCTTTTCTTTGGGCAAGAGTTCCATGATTT 
TGATACTGTACCTTTGGATCCACCATGGGTTGCAN

BI758971 

GGAAAAGAAATACTGTTTTAGAGAAATAACATTTTCAACAAAACATCCCTGGAGTCAGAT
TTTGAGTTGGGGTGGGCTAATCAGGGAGTCGGGGCTCTCTGCGTGATGTCAGTTCTATGG 
CTAACTGGTTTTTCTAAACCAGCCAGCTGCCTATCAAAACAGTACAACTTTTCTAGGAAA 
TGCAATTGGCAAAGACACTTACGATGCTGAGAAGTACACAAGGTGAAACTGCTCCAGTTT 
TTCTCATAGCAGGGTCAGCAGGAAAGCAAGTGGTGCCCCTGGTCCCATCTCACACAGGTG
AGACTGCACCGAGAGGTAACGTGGCCCTCACAGCCCACCACGCCTGGCCTTCGCCCAATT
CTGAAACTTCGTAGGATAGAGCTGGAAAGTGCCACATGGTGAAGCGAGATCCAGCTGTCT 
GGGTGGATGTCGGAGTCCATAGGCTGAGCAGAGATGGTTCTTAGTGAGGTTCTCGCTGCC 
AGTTGACGGTGAAATCATAGCTGCCATTTACATTTTGTGAGATTATGAAAAACATAAGAC 
TAAAGAAACTAAATGTGTTATTCCTGTGGACACAAAAATGTGTGTTTTTCAGATGGGGAG
GGGACCAAAAAGGAAAAACATTTCATCTTAAAACTTTCCTAAGACAAAGGAAAACAAAAA
ACCATGCTCTACAACTTCAAATTTTTCTTACAAAGAAAAATTTAATATTCGATGAGCAGG 
TTGAACCAGGCTTAAAGCAGACATACTAGGAAATGGTGCAGCCTGTAAGAATGCCAGTTT 
GTAAGTACTGACTTTGGAAAAGATCATCGCTCTATCAGACACTTAGGGTCCTGGTCTGGC 
CATTTTGGCCTGATGTGATGCCAAAAGACC 

AA468565

TTTTATCGTGTGGAAAGATGATAGGTTTATAGTGACTCAAAATATTTTAGAAAAATTTCT
GTAGTGTCAAGTTCTTTCAAACTTAAAATTTTAACCCCAGAGGATTTTCGCTGAATAAAT 
GAGAATTGGCTCTATTTCTTCTACTTCTGGATAGCCCGAGTAAAAATACTAATAATTTCT 
AGATTTTAGTGGGGAACTACAATTATTAGGACCCATGGATATTGCTGCAGTTCAAATACA 
ATACAGTAATTACAAAATATAGACCATCTCTTTACAAATACAAATTATAGTATATTACAA
GTCATGTACAGTAAATCTATTTTAAACAAACTAGTGTATCTAAGTTTACCTGGTTGCGAG 
TGCATTAT 

AA437099 

CTTACAAAGAAAAATTTAATATTCGATGAGAGGTTGAACCAGGCTTAAAGCAGACATACT
AGGAAATGGTGCAGCCTGTAAGAATGCCAGTTTGTAAGTACTGACTTTGGAAAAGATCAT 
CGCCTCTATCAGACACTTAGGGTCCTGGTCTGGCAATTTTGGCCTGATGTGATGCCACAA 
GACCCAACAGAGAGAGACACAGAGTCCAGGATAATGTTGACAGTGGTGTAGCCCTTTAGG 
AGAAATGGCGCTCCCTGCGGCTGGTATTAGGTTACCATTGGCACCGAAGAGACCAGGAGG
ATAAGAATATCCATAATTTCAGAGCTGCCCTGGCACAGTACCTGCCCCGTCGGAGGCTCT
CACTGGCAAATGACAGCTCTGTGCAAGGAGCACTCCCAAGTATAAAAATTAT 

CA867864 

CCGCGTCCGGTCAGATGGTACAAGTTTGTCTCTATAATTAAGACTTTTCCACCATCACAA 
ACTTTAAACACAAAGTCTAAAATCTTGGGCAGCATAGAAAATAGGTTCTAGCTAAGCAGG
AGTTTTGTCCTCTACCAAGACCTTTCCTGAAAATCACTTATCAAGACAGTTTCCTGTAAG 
AAAAAGCCATATCCCAGCTGATTTTCCTTCCTGGGGCCAAAATCTGCTATTATTCGGCCT 
GAAAGCCTTGATGACTCTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGT 
GTATGGATGCTTGTGTGTGTGTATGGGGAATATGTGATTAATGTGTGTTGGCTGCTGTTG 
TCTCTGATTTGGCTACTGTTGTTTCTGATTTAAATCTAAGTAAATGTTTAATTAAATGTA
TAGAATGCTGTCTCTAATGTGACCCTCTCTCCTTATTAAATCCTCTTATTAACCCACTCC 
TATGAGACCATCTTATTTCTTGCAGATGAATGATGCTATGGGATTTGAATTGCCCTGGGT 
GTATTTTGTCAGTCTCGTCATCTTTGGGTCATTTTTCGTACTAAATCTTGTACTTGGTGT 
ATTGAGCGGG 

AA682690 

AATGCAAAATGTTCTTCAGAATAAAACTGTGTAATAATTTTTATACTTGGGATGTGCTCC 
TTGCACAGAGCTGTCATTTGCCAGTGAGAGCCTCGACAGGCAGGTACTGTGCCAGGGCAG 
CTCTGAAATTATGGATATTCTTATCCTCCTGGTTCCTTCTGTGCTCAATGGTAACCTAAT 
ACCAGCCGCAGGACNCGCCATTTCTCCTAAAGGGCTACACCACTGTCAACATTATC

AA701888 

TCAGCGAAAATCCTCTGGGGTTAAAATTTTAAGTTTGAAAGAACTTGACACTACAGAAAT 
TTTTCTAAAATATTTTGAGTCACTATAAACCTATCATCTTTCCACAAGATATACCAGATG 
ACTATTTGCAGTCTTTTCTTTGGGCAAGAGTTCCATGATTTTGATACTGTACCTTTGGAT
CCACCATGGGTTGCAACTGTCTTTGGTTTTGTTTGTTTGACTTGAACCACCCTCTGGTAA
GTAAGTAAGTGAATTACAGAGCAGGTCCAGCTGGCTGCTCTGCCCCTTGGGTATCCATAG 
TTACGGTTTTCTCTGTGGCCCACCCAGGGTGTTTTTTGCATCGCTGGTGCAGAAATGCAT 
AGGTGGATGAGATATAGCTGCTCTTGTCCTCTGGGGACTGGTGGTGCTGCTTAAGAAATA 
AGGGGTG

BU182632

TTTTGTCAGTCTCGTCATCTTTGGGTCATTTTTCGTACTAAATCTTGTACTTGGTGTATT 
GAGCGGGCACAGTGGCTCACGCCTATAATCCCAGCACTTTCGGAGGCCGAGGCAGCTGGA
CCACCCGAGATCAGGAGTTTGAGACCAGCCTGACTAAGGCAGTGAAACCCTGTCTCTACT
AAAAATACAAAAATTAGCCAGGCATGGTGGCGCATGCCTGTAATCCCAGCTACTTGGGAG 
GCTGAGGCAGGAGAATCACTTGAACCAGGGAGGTGGAGATTGCAGTGAGCCAAGACTGCA 
CCATTGCATTCCAGCCTGGGTGACAAGAGCAAAACTCCATCTCAAAAAAAAAAAAAAAAA 
AAAAAAAAAAAGACTTTTCTCTCATTCAACACTTTACCAGCATCTACTGACAGAAAATGG
ACAATTGAATTTCCTCCAATATATATACCTCTGATATGTCTGCTTTGTAAAAGAGTAGTG
TAATTGCTTACAACATTGAAAAGGTTGTTATTGGGGTCCTGGGGTAGCCAGGATATCGGC 
ATGATTTGTCACCATATTCAGAATAAAACTGTACTGCAATAGTGAGTTAATTCCATATCT 
TGGCCAACAGAGAATTTTTGGCCAGTGGCTACTAAGGCACACGGAAGTCCAGTCTAAAAG 
GGACAGGGGAGGACTCTTTGTAGATAGTTCTTATGATTAAAAAATAACTTCCTATGTGTT
GTAGTGATGATTTAAGCTGACAGAATGCTAAAGACACCCCTTATGATTACCTGGTAGCAA
AGTACCTTCCCCACATTTAACCTGGATTTGCCCTTTTGGGTTTGAAAGAGGCTAAATA 

BQ898429 

GGTGGGATTCGGCACGAGGGCAAGACTTCCCCACATTTAGCTGGATTTGTCTTTGGTTTG 
AAGAGGCTAATACGTGAAAGATTTGTTCACAGTTGGATGTCCCCTTTTCTGAACCATGAA
GTAATATTTGTGATATGGAGTTCGAATGGCTGAGGTCTAGGTGTGCCGAGAAAGATTCAG 
GGTCCTTCGGTACCCTCACATGGCTTGGCTTTGGTAGAACAAGAAACTAAGCTCTGATTT 
GGCTTTAAATGAGAGTGCTAAATTTCCTTTTTCTAATAAAGAACCTAGCTAAACATTTAT 
ATATACTTTTGAACACTGAACTTTCTTGTTGCAGAGTTAACAGCTGTTGGGGGTAGCTGA 
CAGCTGGATCCTGGTGCTGTTGGTACCATGGTACCTGAAGTGCACAGGCTGGTAGCCACA
CCTGACATTAACAAGTGAGTGGTAACCTCTCTGCCGCTGGCTCACAGCTACTGTTTCCAT 
AGAAATGGCTGTCGGGATCAGTGGAAACGAGGTAAGTGAAAGTTTTCGCTGATCCTTGTT 
TCCATCAAGCTGACGTCTGTTTCCCTGGCAACAGCAGTGGACAGCAGCCAGGCGCTAGCA 
ACAGATTCAGTAGAGCTCTCACTTGTCAGCTGTGGCTATCATCTGTTCCTGACCAAGTTC 
TTTTTTTTTTTTTTAATAATGTACAGAAAGACCTCTGAGGACCCAGGAGGCACCTCTGGC
CACATGTGCCCTCCTGGATGCTCGTTTTGCAGATGGAGAGCTGTGTGCTGAGTTGACTTC 
TCTGTCCGCAGTTCCCCCTCCACCTGTGCTCTGGGTTGTTGATGTGCCAGTTAAAACAGG 
GAGGCTGCTTCAGGGTATTAGTGTTGCCAAGGGGAGGCTGTTGAAATCTGGTTGATCCCA 
AATC 

BQ711800

CAAAGTACTTCCCCACATTTAGCTGGATTTGTCTTTGGTTTGAAGAGGCTAATACGTGAA
AGATTTGTTCACAGTTGGATGTCCCCTTTTCTGAACCATGAAGTAATATTGTGAATGGAG 
TTGAATGCTGAGGTTAGGGTGCCGGAAAGATTCAGGGTCCTTCGGTACCCTCACATGGCT 
TGGCTTTGGTAGAACAAGAAACTAAGCTCTGATTTGGCTTTAAATGAGAGTGCTAAATTT 
CCTTTTTCTAATAAAGAACCTAGCTAAACATTTATATATACTTTTGAACACTGAACTTTC
TTGTTGCAGAGTTAACAGCTGTTGGGGGTAGCTGACAGCTGGATCCTGGTGCTGTTGGTA 
CCATGGTACCTGAAGTGCACAGGCTGGTAGCCACACCTGACATTAACAAGTGAGTGGTAA 
CCTCTCTGCCGCTGGCTCACAGCTACTGTTTCCATAGAAATGGCTGTCGGGATCAGTGGA 
AACGAGGTAAGTGAAAGTTTTCGCTGATCCTTGTTTCCATCAAGCTGACGTCTGTTTCCC
TGGCAACAGCAGTGGACAGCAGCCAGGCGCTAGCAACAGATTCAGTAGAGCTCTCACTTG
TCAGCTGTGGCTATCATCTGTTCCTGACCAAGTTCTTTTTTTTTTTTTTAATAATGTACA 
GAAAGACCTCTGAGGACCCAGGGAGCACCTCTGGCCACATGTGCCCTCCTGAATGCTCGT 
TTTGCAAATGGAGAGCTGTGTGCTGAGTTGACTTCTCTGTCCGCAGGTCCCCCTCCAACT 
GTGCTCCTGGGTTGTGATGTGCAGGGTTAAACCAGGGAAGCTGTTGAAGGGTATTAGTGT
TGCCAGGGAAAGGCTGTTGAATTCTGGTTGATCCCAAATCCCTAGGGGGAAGAGAAATCC
CTTACGAGTGGTTTTTCATGGCCAGGAACCCTATA 

AA703120 

TCAGCGAAAATCCTCTGGGGTTAAAATTTTAAGTTTGAAAGAACTTGACACTACAGAAAT 
TTTTCTAAAATATTTTGAGTCACTATAAACCTATCATCTTTCCACAAGATATACCAGATG
ACTATTTGCAGTCTTTTCTTTGGGCAAGAGTTCCATGATTTTGATACTGTACCTTTGGAT 
CCACCATGGGTTGCAACTGTCTTTGGTTTTGTTTGTTTGACTTGAACCACCCTCTGGTAA 
GTAAGTAAGTGAATTACAGAGCAGGTCCAGCTGGCTGCTCTGCCCCTTGGGTATCCATAG 
TTACGGTTTTCTCTGTGGCCCACCCAGGGTGTTTTTTGCATCGCTGGTGCAGAAATGCAT 
AGGTGGATGAGATATAGCTGCT

AA978315 

GTATATCTTGTGGAAAGATGATAGGTTTATAGTGACTCAAAATATTTTAGAAAAATTTCT 
GTAGTGTCAAGTTCTTTCAAACTTAAAATTTTAACCCCAGAGGATTTTCGCTGAATAAAT
GAGAATTGGCTCTATTTCTTCTACTTCTGGATAGCCCGAGTAAAAATACTAATAATTTCT
AGATTTTAGTGGGGAACTACAATTATTAGGACCCATGGATATAGCTGCAGTTCAAATACA 
ATACAGTAATTACAAAATATAGACCATCTCTTTACAAATACAAATTATAGTATATTACAA 
GTCATGTACAGTAAATCTATAATTTTAAACAAACTAGTGTATCTAAGTTTACCAGGTTGC 
GAGTGCATTATTATTCCAGTTTACAGTTGCCCTTAGCGTGACAGTCAGAAACCGACCATC 

GGAGTGATATTCTCTTATGTAAACAGGCGTCACATCACAGA

BE550599 

TTTTTTTTTTTGTGGAAAGATGATAGGTTTATAGTGACTCAAAATATTTTAGAAAAATTT 
CTGTAGTGTCAAGTTCTTTCAAACTTAAAATTTTAACCCCAGAGGATTTTCGCTGAATAA 
ATGAGAATTGGCTCTATTTCTTCTACTTCTGGATAGCCCGAGTAAAAATACTAATAATTT
CTAGATTTTAGTGGGGAACTACAATTATTAGGACCCATGGATATTGCTGCAGTTCAAATA 
CAATACAGTAATTACAAAATATAGACCATCTCTTTACAAATACAAATTATAGTATATTAC
AAGTCATGTACAGTAAATCTATAATTTTAAACAAACTAGTGTATCTAAGTTTACCTGGTT 
GCGAGTGCATTATTATTCCAGTTTACAGTTGCCCTTAGCGTGACAGTCAGAAACCGACCA 
TCGGAGTGATATTCTCTTATGTAAACTGGCGTCACATCACAGAAAACCTTATTTATGAGG 
TCCCATTGCCCTCGCAATAATCACTGGTAGCTGGGTTCTGACTTACTTACACACCGTATT
TCAGAACAGCTAAACAG 

BE502741 

TTTGGTATATCTTGTGGAAAGATGATAGGTTTATAGTGACTCAAAATATTTTAGAAAAAT 
TTCTGTAGTGTCAAGTTCTTTCAAACTTAAAATTTTAACCCCAGAGGATTTTCGCTGAAT
AAATGAGAATTGGCTCTATTTCTTCTACTTCTGGATAGCCCGAGTAAAAATACTAATAAT 
TTCTAGATTTTAGTGGGGAACTACAATTATTAGGACCCATGGATATTGCTGCAGTTCAAA 
TACAATACAGTAATTACAAAATATAGACCATCTCTTTACAAATACAAATTATAGTATATT 
ACAAGTCATGTACAGTAAATCTATAATTTTAAACAAACTAGTGTATCTAAGTTTACCTGG 
TTGCGAGTGCATTATTATTCCAGTTTACAGTTGCCCTTAGCGTGACAGTCAGAAACCGAC
CATCGGAGTGATATTCTCTTATGTAAACTGGCGTCACATCACAGAAAACCTTATTTATGA 
G 

AW872382 

TTTTTTTGTGGAAAGATGATAGGTTTATAGTGACTCAAAATATTTTAGAAAAATTTCTGT
AGTGTCAAGTTCTTTCAAACTTAAAATTTTAACCCCAGAGGATTTTCGCTGAATAAATGA 
GAATTGGCTCTATTTCTTCTACTTCTGGATAGCCCGAGTAAAAATACTAATAATTTCTAG 
ATTTTAGTGGGGAACTACAATTATTAGGACCCATGGATATTGCTGCAGTTCAAATACAAT 
ACAGTAATTACAAAATATAGACCATCTCTTTACAAATACAAATTATAGTATATTACAAGT 

CATGTACAGTAAATCTATAATTTTAAACAAACTAGTGTATCTAAGTTTACCTGGTTGCGA
GTGCATTATTATTCCAGTTTACAGTTGCCCTTAGCGTGACAGTCAGAAACCGACCATCGG 
AGTGATATTCTCTTATGTAAACTGGCGTCACATCACAGAAAACCTT 

AW444663 

CGGCCGCCAACTTTTTTGAATGAGTGAAGTGCCAGGTACCATGAGAAAACCCTAGCTGGT
AAAGATCAAACCTGAGTTAGTTCTAAATTCACATACGGATTTTTTTTGCATGACGAAATC 
TATTCTCTTTTTCCTGACAACTTCTCCACCTAGATGTTTGGGAAAGTTGCCATGAGAGAT 
AACAACCAGATCAATAGGAACAATAACTTCCAGACGTTTCCCCAGGCGGTGCTGCTGCTC 
TTCAGGTGACTGCAACTGGCTTGGGCGGTGCTCCTGGGCAGGGGGGTCCGCTAGGCGTGG 

GTCCAGAGGGACGGAGGACACAGGTTATTAAAGCAGTGTGCCTTTCTCAGTTG

AW341279 

TAAATAACTAACACCATTTTGTTATGAAGACCTTACAAACCTCTTCTTAAGACATTCTTA 
CTCTGATCCAGGCAAAAACACTTCAAGGTTTGTAAATGACTCTTTCCTGACATAAATCCT 
TTTTTTATTAAAATGCAAAATGTTCTTCAGAATAAAACTGTGTAATAATTTTTATACTTG
GGAGTGCTCCTTGCACAGAGCTGTCATTTGCCAGTGAGAGCCTCCGACGGGGCAGGTACT
GTGCCAGGGCAGCTCTGAAATTATGGATATTCTTATCCTCCTGGTTCCTTCGGTGCCAAT 
GGTAACCTAATACCAGCCGCAGGGAGCGCCATTTCTCCTAAAGGGCTACACCACTGTCAA 
CATTATCCTGGACTCTGTGTCTCTCTCTGTTGGGTCTTGTGGCATCACATCAGGCCAAAA 
TTGCCAGACCAGGACCCTAAGTGTCTGATAGAGGCGATGATCTTTTCCAAAGTCAGTACT
TACAAACTGGCATTCTTACAGGCTGCACCATTTCCTAGTATGTCTG 

CF456750 

ACTTTTCTAGGAAATGCAATTGGCAAAGACACTTACGATGCTGAGAAGTACACAAGGTGA 
AACTGCTCCAGTTTTTCTCATAGCAGGGTCAGCAGGAAAGCAAGTGGTGCCCCTGGTCCC 
ATCTCACACAGGTGAGACTGCACCGAGAGGTAACGTGGCCCTCACAGCCCACCACGCCTG 
GCCTTCGCCCAATTCTGAAACTTCGTAGGATAGAGCTGGAAAGTGCCACATGGTGAAGCG 
AGATCCAGCTGTCTGGGTGGATGTCGGAGTCCATAGGCTGAGCAGAGATGGTTCTTAGTG 
AGGTTCTCGCTGCCAGTTGACGGTGAAATCATAGCTGCCATTTACATTTTGTGAGATTAT 
GAAAAACATAAGACTAAAGAAACTAAATGTGTTATTCCTGTGGACACAAAAATGTGTGTT 
TTTCAGATGGGGAGGGGACCAAAAAGGAAAAACATTTCATCTTAAAACTTTCCTAAGACA 
AAGGAAAACAAAAAACCATGCTCTACAACTTCAAATTTTTCTTACAAAGAAAAATTTAAT 
ATTCGATGAGAGGTTGAACCAGGCTTAAAGCAGACATACTAGGAAATGGTGCAGCCTGTA 
AGAATGCCAGTTTGTAAGTACTGACTTTGGAAAAGATCATCGCCTCTATCAGACACTTAG 
GGTCCTGGTCTGGCAATTTTGGCCTGATGTGATGCCACAAGACCCAACAGAGAGAGACAC 
AGAGTCCAGGATAATGTTGACAGTGGTGTA 

AW139850

TTTTTTTTTTTTTTTTTAGAAGAAATAGAGCCAATTCTCATTTATTCAGCGAAAATCCTC 
TGGGGTTAAAATTTTAAGTTTGAAAGAACTTGACACTACAGAAATTTTTCTAAAATATTT 
TGAGTCACTATAAACCTATCATCTTTCCACAAGATATACCAGATGACTATTTGCAGTCTT 
TTCTTTGGGCAAGAGTTCCATGATTTTGATACTGTACCTTTGGATCCACCATGGGTTGCA 
ACTGTCTTTGGTTTTGTTTGTTTGACTTGAACCACCCTCTGGTAAGTAAGTGAATTACAG 
AGCAGGTCCAGCTGGCTGCTCTGCCCCTTGGGTATCCATAGTTACGGTTTTCTCTGTGGC 
CCACCCAGGGTGTTTTTTGCATCGCTGGTGCAGAAATGCACAGGTGGATGAGATATAGCT 
GCTCTTGTCCTC 

AW029633 

TTATCTTGTGGAAAGATGATAGGTTTATAGTGACTCAAAATATTTTAGAAAAATTTCTGT 
AGTGTCAAGTTCTTTCAAACTTAAAATTTTAACCCCAGAGGATTTTCGCTGAATAAATGA
GAATTGGCTCTATTTCTTCTACTTCTGGATAGCCCGAGTAAAAATACTAATAATTTCTAG 
ATTTTAGTGGGGAACTACAATTATTAGGACCCATGGATATTGCTGCAGTTCAAATACAAT 
ACAGTAATTACAAAATATAGACCATCTCTTTACAAATACAAATTATAGTATATTACAAGT 
CATGTACAGTAAATCTATAATTTTAAACAAACTAGTGTATCTAAGTTTACCTGGTTGCGA 
GTGCATTATTATTCCAGTTTACAGTTGCCCTTAGCGTGACAGTCAGAAACCGACCATCGG
AGTGATATTCTCTTATGTAAACTGGCGTCACATCACAGAAAACCTTATTTATGAGGTCCC 
ATTGCCCTCGCAATAATCACTG

AI963788

TTTTTCTTGTGGAAAGATGATAGGTTTATAGTGACTCAAAATATTTTAGAAAAATTTCTG 
TAGTGTCAAGTTCTTTCAAACTTAAAATTTTAACCCCAGAGGATTTTCGCTGAATAAATG
AGAATTGGCTCTATTTCTTCTACTTCTGGATAGCCCGAGTAAAAATACTAATAATTTCTA 
GATTTTAGTGGGGAACTACAATTATTAGGACCCATGGATATTGCTGCAGTTCAAATACAA 
TACAGTAATTACAAAATATAGACCATCTCTTTACAAATACAAATTATAGTATATTACAAG 
TCATGTACAGTAAATCTATAATTTTAAACAAACTAGTGTATCTAAGTTTACCTGGT 

AI951788 

ATCTTGTGGAAAGATGATAGGTTTATAGTGACTCAAAATATTTTAGAAAAATTTCTGTAG 
TGTCAAGTTCTTTCAAACTTAAAATTTTAACCCCAGAGGATTTTCGCTGAATAAATGAGA 
ATTGGCTCTATTTCTTCTACTTCTGGATAGCCCGAGTAAAAATACTAATAATTTCTAGAT 
TTTAGTGGGGAACCTACAATTATTAGGACCCATGGATATTGCTGCAGTTCAAATACAATA
CAGTAATTACAAAATATAGACCATCTCTTTACAAATACAAATTATAGTATATTACAAGTC 
ATGTACAGTAAATCTATAATTTTAAACAAACTAGTGTATCTAAGTTTACGTGGTTGCGAG 
TGCATTATTATTCCAGTTTACAGTTGCCCTTAGCGTGACAGTCAGAAACCGACCATCGGA 
GTGATATTCTCTTATGTAAACT 

AI680744 

TTTTCTTCAAATAATTACAAGCTCAGCGGCTGAAATCTACAAATGGGGACTACCAAAAGC 
CCACCCAATCCAGCTCATTTTGCTATCGTTTTATAACAATTAATCTGCATTATATTTGGA 
TCCAGACAAATAAAGCAATTATAAATGTATCTCACTTTAGAACAGACAAAAAAAGGGCAT 

GCTATGGAAATTGTTTAAATCTCAAGCAACAATGCTGATTAATTTCTGGTCAATAATCGT
TCTATAGTTCTCCTTCATGAAGCCTGGTGAGGTTCCAGGGAAACAGCTTGATTTGGGAAG 
CCTCAGCAGAAAAGAAAGCATCTCAGAGGACACATAAAATGTCTGGCAACCCCTCTTGGC 
GGCCCTCATCCAGCAAAGCTTGTGTGGTCTTGGCAACTGTCCTCAGGACTCTGCTTTCAA 
GATGAAAGAGGTGTAGCTTACCCGCTCAATACACCAAGTACAAGATTTAGTACGAAAAAT 

GACCCAAAGATGACGAGACTGACAAAATACACCCAGGGCAATTCAAATCCCATAGCATCA
TTCATCTGCAAG 

AI601252

TTTGTAAATAACAAACACCACTTTGTTATGAAGACCTTACAAACCTCTTCTTAAGACATT 
CTTACTCTGATCCAGGCAAAAACACTTCAAGGTTTGTAAATGACTCTTTCCTGACATAAA
TCCTTTTTTATTAAAATGCAAAATGTTCTTCAGAATAAAACTGTGTAATAATTTTTATAC 
TTGGGAGTGCTCCTTGCACAGAGCTGTCATTTGCCAGTGAGAGCCTCCGACGGGGCAGGT 
ACTGTGCCAGGGCAGCTCTGAAATTATGGATATTCTTATCCTCCTGGTTCCTTCGGTGCC 
AATGGTAACCTAATACCAGCCGCAGGGAGCGCCATTTCTCCTAAAGGGCTACACCACTGT 
CAACATTATCCTGGACTCTGTGTCTCTCTCTGTTGGGTCTTGTGGCATCACATCAGGCCA
AAATTGCCAGACCAGGACCCTAAGTGTCTGATAGAGGCGATGATCTTTTCCAAAGTCAGT
ACTTACAAACT 

AI459166 

TTTTTTTTTGGTCCAAAATTTTTAATAGTATACAGACAACCTGTTAATTTTTTTTTTTTT
TTTTTTTGTAAATAACAAACACCACTTTGTTATGAAGACCTTACAAACCTCTTCTTAAGA 
CATTCTTACTCTGATCCAGGCAAAAACACTTCAAGGTTTGTAAATGACTCTTTCCTGACA 
TAAATCCTTTTTTATTAAAATGCAAAATGTTCTTCAGAATAAAACTGTGTAATAATTTTT 
ATACTTGGGAGTGCTCCTTGCACAGAGCTGTCATTTGCCAGTGAGAGCCTCCGACGGGGC 
AGGTACTGTGCCAGGGCAGCTCTGAAATTATGGATATTCTTATCCTCCTGGTTCCTTCGG
TGCCAATGGTAACCTAATACCAGCCGCAGGGAGCGCCATTT 

AA885750 

TCGACAGCTACCAGTGATTATTGCGAGGGCAATGGGACCTCATAAATAAGGTTTTCTGTG 
ATGTGACGCCATTTACATAAGAGAATATCACTCCGATGGTCGGTTTCTGACTGTCACGCT
AAGGGCAACTGTAAACTGGAATAATAATGCACTCGCAACCAGGTAAACTTAGATACACTA 
GTTTGTTTAAAATTATAGATTTACTGTACATGACTTGTAATATACTATAATTTGTATTTG 
TAAAGAGATGGTCTATATTTTGTAATTACTGTATTGTATTTGAACTGCAGCAATATCCAT 
GGGTCCTAATAATTGTAGTTCCCCACTAAAATCTAGAAATTATTAGTATTTTTACTCGGG 
CTATCCAGAAGTAGAAGAAATAGAGCC

BX092736 

GAATATGTGATTAATGTGTGTTGGCTGCTGTTGTCTCTGATTTGGCTACTGTTGTTTCTG 
ATTTAAATCTAAGTAAATGTTTAATTAAATGTATAGAATGCTGTCTCTAATGTGACCCTC 

TCTCCTTATTAAATCCTCTTATTAACCCACTCCTATGAGACCATCTTATTTCTTGCAGAT
GAATGATGCTATGGGATTTGAATTGCCCTGGGTGTATTTTGTCAGTCTCGTCATCTTTGG 
GTCATTTTTCGTACTAAATCTTGTACTTGGTGTATTGAGCGGGTAAGCTACACCTCTTTC
ATCTTGAAAGCAGAGTCCTGAGGACAGTTGCCAAGACCACACAAGCTTTGCTGGATGAGG 
GCCGCCAAGAGGGGTTGCCAGACATTTTATGTGTCCTCTGAGATGCTTTCTTTTCTGCTG 

AGGCTTCCCAAATCAAGCTGTTTCCTGGAACCTCACCAGGCTTCATGAAGGAGA

BX114568

TTTGTAAATAACAAACACCACTTTGTTATGAAGACCTTACAAACCTCTTCTTAAGACATT 
CTTACTCTGATCCAGGCAAAAACACTTCAAGGTTTGTAAATGACTCTTTCCTGACATAAA 

TCCTTTTTTATTAAAATGCAAAATGTTCTTCAGAATAAAACTGTGTAATAATTTTTATAC
TTGGGAGTGCTCCTTGCACAGAGCTGTCATTTGCCAGTGAGAGCCTCCGACGGGGCAGGT 
ACTGTGCCAGGGCAGCTCTGAAATTATGGATATTCTTATCCTCCTGGTTCCTTCGGTGCC 
AATGGTAACCTAATACCAGCCGCAGGGAGCGCCATTTCTCCTAAAGGGCTACACCACTGT 
CAACATTATCCTGGACTCTGTGTCTCTCTCTGTTGGGTCTTGTGGCATCACATCAGGCCA 

AAATTGCCAGACCAGGACCCTAAGTGTCTGATAGAGGCGATGATCTTTTCCAAAGTCAGT
ACTTACAAACTGGCATTCTTACAGGCTGCACCATTTCCTAGTATGTCTGCTTTAAGCCTG
GTTCAACCTCTCATCGAATATTAAATTTTTCTTTGTAAGAAAAAAAAAAAAAAA 

BE672659 

TTTGTAAATAACAAACACCACTTTGTTATGAAGACCTTACAAACCTCTTCTTAAGACATT
CTTACTCTGATCCAGGCAAAAACACTTCAAGGTTTGTAAATGACTCTTTCCTGACATAAA 
TCCTTTTTTATTAAAATGCAAAATGTTCTTCAGAATAAAACTGTGTAATAATTTTTATAC 
TTGGGAGTGCTCCTTGCACAGAGCTGTCATTTGCCAGTGAGAGCCTCCGACAGGGCAGGT 
ACTGTGCCAGGGCAGCTCTGAAATTATGGATATTCTTATCCTCCTGGTTCCTTCGGTGCC 
AATGGTAACCTAATACCAGCCGCAGGGAGCGCCATTTCTCCTAAAGGGCTACACCACTGT
CAACATTATCCTGGACTCTGTGTCTCTCTCTGTTGGGTCTTGTGGCATCACATCAGGCCA 
AAATTGCCAGACCAGGACCCTAAGTGTCTGATAGAGGCGATGATCTTTTCCAAAGTCAGT 
ACTTACAAACTGGCATTCTTACAGGCTGCACCATTTCCTAGTATGTCTGCTTTAAGCCTG 
GTTCAACC 

N78509 

GGAGAAAGGAGGGAAACCAGGAGCAGCCGGCATGGGCAGTGGCAGAATTGGCCCTGNTAG 
AGAGCAGAGCTGATGCCATCCTTTTGGCAAATAGCTGACATTTTATGGTGTGGTGCTGGG 
TGAGCCCCCTGTGAGGGTTGAACAGATGTGGACAGGACTTGGGTCCAGGCACTAGAGTGG 
TGCAGCCTGTAAGAATGCCAGTTTGTAAGTACTGACTTTGGAAAAGATCATCGCCTCTAT
CAGACACTTAGGGTCCTGGTCTGGCAATTTTGGCCTGATGTGATGCCACAAGACCCAACA 
GAGAGAGACACAGAGTCCAGGATNAATGTTGACAGTGGTGTAGCCTTTAGGAAGAAATGG 
CGCTCCCTGCGGCTGGTATTAGGTTACCATTGGCANCCGAAGGAACCCAGGAGGATTAAG 
AATTTCCCTAATTTCAGAACTTGCCCTGGCACAGTA 

N73668 

GGTCCAAAATTTTTAATAGTATACAGACAACCTGTTAATTTTTTTTTTTTTTTTTTTTGT 
AAATAACAAACACCACTTTGTTATGAAGACCTTACAAACCTCTTCTTAAGACATTCTTAC 
TCTGATCCAGGCAAAAACACTTCAAGGTTTGTAAATCGACTCTTTCCTGACATAAATCCT 
TTTTTATTAAAATNGCAAAATTGTTCTTCAGAATAAAACTGTGTAATAATTTTTATACTT
GGGAGTGCTCCTTGCACAGAGCTGTCATTTGCCAGTGAGAGCCTCCGACGGGGCAGGTAC 
TGTGCCAGGGCAGCTCTGAAATTATGGAAATTCTTATCCCCCTGGTTCCTNCGGTGGCCA 
ATGGGTAACCTAATACCAGCCCGCGGGAAGCGCCAATTTCNCCCAAAAGGGGGTAAACCA 
CTGGTNAAACATTA 

N46744 

TTTTTCTTTTGTAAATAACAAACACCACTTTGTTATGAAGACCTTACAAACCTCTTCTTA 
AGACATTCTTACTCTGATCCAGGCAAAAACACTTCAAGGTTTGTAAATGACTCTTTCCTG 
ACATAAATCCTTTTTTATTAAAATGCAAAATGTTCTTCAGAATAAAACTGTGTAATAATT 
TTTATANGTGGGGGNGCTC

N39597

ACAAAGAAAAATTTAATATTCGATGAGAGGTTGAACCAGGCTTAAAGCAGACATACTAGG 
AAATGGTGCAGCCTGTAAGAATGCCAGTTTGTAAGTACTGACTTTGGAAAAGATCATCGC 
CTCTATCAGACACTTAGGGTCCTGGTCTGGCAATTTTGGCCTGATGTGATGCCACAAGAC
CCAACAGAGAGAGACACAGAGTCCAGGATAATGTTGACAGTGGTGTAGCCCTTTAGGAGA 
AATGGCGCTCCCTGCGGCTGGTATTAGGTTACCATTGGCACCGAAGAACCAGGAGGATAA 
GAATATCCATAATTTCAGAGCTTGCCCTGGCACAGTACCTGCCCCGTCGGAGGCTCTCAC 
TGGGCAAATGGACAGCTCTGTGCAAGGAGCACTCCCAAGTATAANAATTATTACACAGTT 
TTATTCTGAAGAACATTTTGCATTTTAATAAAAAANGGA

BF439267 

TTTTTTTTTTTTTTGGGCCAAAATTTTTAATAGTATACAGACAACCTGTTAATTTTTTTT 
TTXXTTTTTTTTGT ;y^ T ^ CA ^ CACC ^ 

TAAGACATTCTTACTCTGATCCAGGCAAAAACACTTCAAGGTTTGTAAATGACTTTTTCC
TGACATAAATCCTTTTTTATTAAAATGCAAAATGTTCTTCAGAATAAAACTGTGTAATAA 
TTTTTATACTTGGGAGTGCTCCTTGCACAGAGCTGTCATTTGCCAGTGAGAGCCTCCGAC 
GGGGCAGGTACTGTGCCAGGGCAGCTCTGAAATTATGGATATTCTTATCCTCCTGGTTCC 
TTCGGTGCCAATGGTAACCTAATACCAGCCGCAGGGAGCGCCATTTCTCCTAAAGGGCTA 

CACCACTGTCAACATTATCCTGG

BF436153 

TTTTTTTTTGGTCCAAAATTTTTAATAGTATACAGACAACCTGTTAATTTTTTTTTTTTT 
TTTTTTTGTAAATAACAAACACCACTTTGTTATGAAGACCTTACAAACCTCTTCTTAAGA 

CATTCTTACTCTGATCCAGGCAAAAACACTTCAAGGTTTGTAAATGACTCTTTCCTGACA
TAAATCCTTTTTTATTAAAATGCAAAATGTTCTTCAGAATAAAACTGTGTAATAATTTTT 
ATACTTGGGAGTGCTCCTTGCACAGAGCTGTCATTTGCCAGTGAGAGCCTCCGACGGGGC 
AGGTACTGTGCCAGGGCAGCTCTGAAATTATGGATATTCTTATCCTCCTGGTTCCTTCGG 
TGCCAATGGTAACCTAATACCAGCCGCAGGGAGCGCCATTTCTCCTAAAGGGCTACACCA 

CTGTCAACATTATCCTGGACTC

BF110611

TTTGTGGAAAGATGATAGGTTTATAGTGACTCAAAATATTTTAGAAAAATTTCTGTAGTG 
TCAAGTTCTTTCAAACTTAAAATTTTAACCCCAGAGGATTTTCGCTGAATAAAATGAGAA 

TTGGCTCTATTTCTTCTACTTCTGGATAGCCCGAGTAAAAATACTAATAATTTCTAGATT
TTAGTGGGGAACTACAATTATTAGGACCCATGGATATTGCTGCAGTTCAAATACAATACA 
GTAATTACAAAATATAGACCATCTCTTTACAAATACAAATTATAGTATATTACAAGTCAT 
GTACAGTAAATCTATAATTTTAAACAAACTAGTGTATCTAAGTTTACCTGGTTGCGAGTG 
CATTATTATTCCAGTTTACAGTTGCCCTTAGCGTGACAGTCAGAAACCGACCATCGGAGT 

GATATTCTCTTATGTAAACTGGCGTCACATCACAGAAAACCTTATTTATGA

M76558

gggcgagcgc ctccgtcccc ggatgtgagc tccggctgcc cgcggtcccg agccagcggc
gcgcgggcgg cggcggcggg caccgggcac cgcggcgggc gggcagacgg gcgggcatgg
ggggagcgcc gagcggcccc ggcggccggg ccggcatcac cgcggcgtct ctccgctaga
ggaggggaca agccagttct cctttgcagc aaaaaattac atgtatatat tattaagata
atatatacat tggattttat ttttttaaaa agtttatttt gctccatttt tgaaaaagag
agagcttggg tggcgagcgg ttttttttta aaatcaatta tccttatttt ctgttatttg
tccccgtccc tccccacccc cctgctgaag cgagaataag ggcagggacc gcggctccta
cctcttggtg atccccttcc ccattccgcc cccgccccaa cgcccagcac agtgccctgc
acacagtagt cgctcaataa atgttcgtgg atgatgatga tgatgatgat gaaaaaaatg
cagcatcaac ggcagcagca agcggaccac gcgaacgagg caaactatgc aagaggcacc
agacttcctc tttctggtga aggaccaact tctcagccga atagctccaa gcaaactgtc
ctgtcttggc aagctgcaat cgatgctgct agacaggcca aggctgccca aactatgagc
acctctgcac ccccacctgt aggatctctc tcccaaagaa aacgtcagca atacgccaag
agcaaaaaac agggtaactc gtccaacagc cgacctgccc gcgccctttt ctgtttatca
ctcaataacc ccatccgaag agcctgcatt agtatagtgg aatggaaacc atttgacata
tttatattat tggctatttt tgccaattgt gtggccttag ctatttacat cccattccct
gaagatgatt ctaattcaac aaatcataac ttggaaaaag tagaatatgc cttcctgatt
atttttacag tcgagacatt tttgaagatt atagcgtatg gattattgct acatcctaat
gcttatgtta ggaatggatg gaatttactg gattttgtta tagtaatagt aggattgttt
agtgtaattt tggaacaatt aaccaaagaa acagaaggcg ggaaccactc aagcggcaaa
tctggaggct ttgatgtcaa agccctccgt gcctttcgag tgttgcgacc acttcgacta
gtgtcaggag tgcccagttt acaagttgtc ctgaactcca ttataaaagc catggttccc
ctccttcaca tagccctttt ggtattattt gtaatcataa tctatgctat tataggattg
gaacttttta ttggaaaaat gcacaaaaca tgtttttttg ctgactcaga tatcgtagct
gaagaggacc cagctccatg tgcgttctca gggaatggac gccagtgtac tgccaatggc
acggaatgta ggagtggctg ggttggcccg aacggaggca tcaccaactt tgataacttt
gcctttgcca tgcttactgt gtttcagtgc atcaccatgg agggctggac agacgtgctc
tactggatga atgatgctat gggatttgaa ttgccctggg tgtattttgt cagtctcgtc
atctttgggt catttttcgt actaaatctt gtacttggtg tattgagcgg agaattctca
aaggaaagag agaaggcaaa agcacgggga gatttccaga agctccggga gaagcagcag
ctggaggagg atctaaaggg ctacttggat tggatcaccc aagctgagga catcgatccg
gagaatgagg aagaaggagg agaggaaggc aaacgaaata ctagcatgcc caccagcgag
actgagtctg tgaacacaga gaacgtcagc ggtgaaggcg agaaccgagg ctgctgtgga
agtctctgtc aagccatctc aaaatccaaa ctcagccgac gctggcgtcg ctggaaccga
ttcaatcgca gaagatgtag ggccgccgtg aagtctgtca cgttttactg gctggttatc
gtcctggtgt ttctgaacac cttaaccatt tcctctgagc actacaatca gccagattgg
ttgacacaga ttcaagatat tgccaacaaa gtcctcttgg ctctgttcac ctgcgagatg
ctggtaaaaa tgtacagctt gggcctccaa gcatatttcg tctctctttt caaccggttt
gattgcttcg tggtgtgtgg tggaatcact gagacgatct tggtggaact ggaaatcatg
tctcccctgg ggatctctgt gtttcggtgt gtgcgcctct taagaatctt caaagtgacc
aggcactgga cttccctgtg caacttagtg gcatccttat taaactccat gaagtccagt
gcttcgctgt tgcttctgct ttttctcttc attatcatct tttccttgct tgggatgcag

ctgtttggcg gcaagtttaa ttttgatgaa acgcaaacca agcggagcac ctttgacaat 

ttccctcaag cacttctcac agtgttccag atcctgacag gcgaagactg gaatgctgtg 

atgtacgatg gcatcatggc ttacgggggc ccatcctctt caggaatgat cgtctgcatc 

tacttcatca tcctcttcat ttgtggtaac tatattctac tgaatgtctt cttggccatc

gctgtagaca atttggctga tgctgaaagt ctgaacactg ctcagaaaga agaagcggaa 

gaaaaggaga ggaaaaagat tgccagaaaa gagagcctag aaaataaaaa gaacaacaaa 

ccagaagtca accagatagc caacagtgac aacaaggtta caattgatga ctatagagaa 

gaggatgaag acaaggaccc ctatccgcct tgcgatgtgc cagtagggga agaggaagag 

gaagaggagg aggatgaacc tgaggttcct gccggacccc gtcctcgaag gatctcggag

ttgaacatga aggaaaaaat tgcccccatc cctgaaggga gcgctttctt cattcttagc 

aagaccaacc cgatccgcgt aggctgccac aagctcatca accaccacat cttcaccaac 

ctcatccttg tcttcatcat gctgagcagt gctgccctgg ccgcagagga ccccatccgc 

agccactcct tccggaacac gatactgggt tactttgact atgccttcac agccatcttt 

actgttgaga tcctgttgaa gatgacaact tttggagctt tcctccacaa aggggccttc

tgcaggaact acttcaattt gctggatatg ctggtggttg gggtgtctct ggtgtcattt 

gggattcaat ccagtgccat ctccgttgtg aagattctga gggtcttaag ggtcctgcgt 

cccctcaggg ccatcaacag agcaaaagga cttaagcacg tggtccagtg cgtcttcgtg 

gccatccgga ccatcggcaa catcatgatc gtcaccaccc tcctgcagtt catgtttgcc 

tgtatcgggg tccagttgtt caaggggaag ttctatcgct gtacggatga agccaaaagt

aaccctgaag aatgcagggg acttttcatc ctctacaagg atggggatgt tgacagtcct 

gtggtccgtg aacggatctg gcaaaacagt gatttcaact tcgacaacgt cctctctgct 

atgatggcgc tcttcacagt ctccacgttt gagggctggc ctgcgttgct gtataaagcc 

atcgactcga atggagagaa catcggccca atctacaacc accgcgtgga gatctccatc 

ttcttcatca tctacatcat cattgtagct ttcttcatga tgaacatctt tgtgggcttt

gtcatcgtta catttcagga acaaggagaa aaagagtata agaactgtga gctggacaaa 

aatcagcgtc agtgtgttga atacgccttg aaagcacgtc ccttgcggag atacatcccc 

aaaaacccct accagtacaa gttctggtac gtggtgaact cttcgccttt cgaatacatg 

atgtttgtcc tcatcatgct caacacactc tgcttggcca tgcagcacta cgagcagtcc 

aagatgttca atgatgccat ggacattctg aacatggtct tcaccggggt gttcaccgtc

gagatggttt tgaaagtcat cgcatttaag cctaaggggt attttagtga cgcctggaac 

acgtttgact ccctcatcgt aatcggcagc attatagacg tggccctcag cgaagcagac 

ccaactgaaa gtgaaaatgt ccctgtccca actgctacac ctgggaactc tgaagagagc 

aatagaatct ccatcacctt tttccgtctt ttccgagtga tgcgattggt gaagcttctc 

agcagggggg aaggcatccg gacattgctg tggactttta ttaagttctt tcaggcgctc

ccgtatgtgg ccctcctcat agccatgctg ttcttcatct atgcggtcat tggcatgcag 

atgtttggga aagttgccat gagagataac aaccagatca ataggaacaa taacttccag 

acgtttcccc aggcggtgct gctgctcttc aggtgtgcaa caggtgaggc ctggcaggag 

atcatgctgg cctgtctccc agggaagctc tgtgaccctg agtcagatta caaccccggg 

gaggagcata catgtgggag caactttgcc attgtctatt tcatcagttt ttacatgctc

tgtgcatttc tgatcatcaa tctgtttgtg gctgtcatca tggataattt cgactatctg 

acccgggact ggtctatttt ggggcctcac catttagatg aattcaaaag aatatggtca 

gaatatgacc ctgaggcaaa gggaaggata aaacaccttg atgtggtcac tctgcttcga 

cgcatccagc ctcccctggg gtttgggaag ttatgtccac acagggtagc gtgcaagaga 

ttagttgcca tgaacatgcc tctcaacagt gacgggacag tcatgtttaa tgcaaccctg

tttgctttgg ttcgaacggc tcttaagatc aagaccgaag ggaacctgga gcaagctaat

gaagaacttc gggctgtgat aaagaaaatt tggaagaaaa ccagcatgaa attacttgac 

caagttgtcc ctccagctgg tgatgatgag gtaaccgtgg ggaagttcta tgccactttc 

ctgatacagg actactttag gaaattcaag aaacggaaag aacaaggact ggtgggaaag 

taccctgcga agaacaccac aattgcccta caggcgggat taaggacact gcatgacatt

gggccagaaa tccggcgtgc tatatcgtgt gatttgcaag atgacgagcc tgaggaaaca 

aaacgagaag aagaagatga tgtgttcaaa agaaatggtg ccctgcttgg aaaccatgtc 

aatcatgtta atagtgatag gagagattcc cttcagcaga ccaataccac ccaccgtccc 

ctgcatgtcc aaaggccttc aattccacct gcaagtgata ctgagaaacc gctgtttcct 

ccagcaggaa attcggtgtg tcataaccat cataaccata attccatagg aaagcaagtt

cccacctcaa caaatgccaa tctcaataat gccaatatgt ccaaagctgc ccatggaaag 

cggcccagca ttgggaacct tgagcatgtg tctgaaaatg ggcatcattc ttcccacaag 

catgaccggg agcctcagag aaggtccagt gtgaaaagaa cccgctatta tgaaacttac 

attaggtccg actcaggaga tgaacagctc ccaactattt gccgggaaga cccagagata 

catggctatt tcagggaccc ccactgcttg ggggagcagg agtatttcag tagtgaggaa

tgctacgagg atgacagctc gcccacctgg agcaggcaaa actatggcta ctacagcaga 

tacccaggca gaaacatcga ctctgagagg ccccgaggct accatcatcc ccaaggattc 

ttggaggacg atgactcgcc cgtttgctat gattcacgga gatctccaag gagacgccta 

ctacctccca ccccagcatc ccaccggaga tcctccttca actttgagtg cctgcgccgg 

cagagcagcc aggaagaggt cccgtcgtct cccatcttcc cccatcgcac ggccctgcct

ctgcatctaa tgcagcaaca gatcatggca gttgccggcc tagattcaag taaagcccag 

aagtactcac cgagtcactc gacccggtcg tgggccaccc ctccagcaac ccctccctac 

cgggactgga caccgtgcta cacccccctg atccaagtgg agcagtcaga ggccctggac 

caggtgaacg gcagcctgcc gtccctgcac cgcagctcct ggtacacaga cgagcccgac 

atctcctacc ggactttcac accagccagc ctgactgtcc ccagcagctt ccggaacaaa

aacagcgaca agcagaggag tgcggacagc ttggtggagg cagtcctgat atccgaaggc 

ttgggacgct atgcaaggga cccaaaattt gtgtcagcaa caaaacacga aatcgctgat 

gcctgtgacc tcaccatcga cgagatggag agtgcagcca gcaccctgct taatgggaac 

gtgcgtcccc gagccaacgg ggatgtgggc cccctctcac accggcagga ctatgagcta 

caggactttg gtcctggcta cagcgacgaa gagccagacc ctgggaggga tgaggaggac

ctggcggatg aaatgatatg catcaccacc ttgtagcccc cagcgagggg cagactggct 

ctggcctcag gtggggcgca ggagagccag gggaaaagtg cctcatagtt aggaaagttt 

aggcactagt tgggagtaat attcaattaa ttagactttt gtataagaga tgtcatgcct 

caagaaagcc ataaacctgg taggaacagg tcccaagcgg ttgagcctgg cagagtacca 

tgcgctcggc cccagctgca ggaaacagca ggccccgccc tctcacagag gatgggtgag

gaggccagac ctgccctgcc ccattgtcca gatgggcact gctgtggagt ctgcttctcc 

catgtaccag ggcaccaggc ccacccaact gaaggcatgg cggcggggtg caggggaaag 

ttaaaggtga tgacgatcat cacacctgtg tcgttacctc agccatcggt ctagcatatc 

agtcactggg cccaacatat ccatttttaa accctttccc ccaaatacac tgcgtcctgg 

ttcctgttta gctgttctga aatacggtgt gtaagtaagt cagaacccag ctaccagtga

ttattgcgag ggcaatggga cctcataaat aaggttttct gtgatgtgac gccagtttac 
ataagagaat atcac 

AF088004 
tttttttttt cttacaaaga aaaatttaat attcgatgag aggttgaacc aggcttaaag

cagacatact aggaaatggt gcagcctgta agaatgccag tttgtaagta ctgactttgg 

aaaagatcat cgcctctatc agacacttag ggtcctggtc tggcaatttt ggcctgatgt 

gatgccacaa gacccaacag agagagacac agagtccagg ataatgttga cagtggtgta 

gccctttagg agaaatggcg ctccctgcgg ctggtattag gttaccattg gcaccgaagg

aaccaggagg ataagaatat ccataatttc agagctgccc tggcacagta cctgccccgt 

cggaggctct cactggcaaa tgacagctct gtgcaaggag cactcccaag tataaaaatt 

attacacagt tttattctga agaacatttt gcattttaat aaaaaaggat ttatgtcagg 

aaagagtcat ttacaaacct tgaagtgttt ttgcctggat cagagtaaga atgtcttaag 

aagaggtttg taaggtcttc ataacaaagt ggtgtttgtt atttacaaaa aaaaaaaaaa

aaaaaaatta acaggttgtc tgtatactat taaaaat 

M83566 

agaataaggg cagggaccgc ggctcctatc tcttggtgat ccccttcccc attccgcccc 

cgcctcaacg cccagcacag tgccctgcac acagtagtcg ctcaataaat gttcgtggat

gatgatgatg atgatgatga aaaaaatgca gcatcaacgg cagcagcaag cggaccacgc 

gaacgaggca aactatgcaa gaggcaccag acttcctctt tctggtgaag gaccaacttc 

tcagccgaat agctccaagc aaactgtcct gtcttggcaa gctgcaatcg atgctgctag 

acaggccaag gctgcccaaa ctatgagcac ctctgcaccc ccacctgtag gatctctctc 

ccaaagaaaa cgtcagcaat acgccaagag caaaaaacag ggtaactcgt ccaacagccg

acctgcccgc gcccttttct gtttatcact caataacccc atccgaagag cctgcattag 

tatagtggaa tggaaaccat ttgacatatt tatattattg gctatttttg ccaattgtgt 

ggccttagct atttacatcc cattccctga agatgattct aattcaacaa atcataactt 

ggaaaaagta gaatatgcct tcctgattat ttttacagtc gagacatttt tgaagattat 

agcgtatgga ttattgctac atcctaatgc ttatgttagg aatggatgga atttactgga

ttttgttata gtaatagtag gattgtttag tgtaattttg gaacaattaa ccaaagaaac 

agaaggcggg aaccactcaa gcggcaaatc tggaggcttt gatgtcaaag ccctccgtgc 

ctttcgagtg ttgcgaccac ttcgactagt gtcaggggtg cccagtttac aagttgtcct 

gaactccatt ataaaagcca tggttcccct ccttcacata gcccttttgg tattatttgt 

aatcataatc tatgctatta taggattgga actttttatt ggaaaaatgc acaaaacatg

tttttttgct gactcagata tcgtagctga agaggaccca gctccatgtg cgttctcagg 

gaatggacgc cagtgtactg ccaatggcac ggaatgtagg agtggctggg ttggcccgaa 

cggaggcatc accaactttg ataactttgc ctttgccatg cttactgtgt ttcagtgcat 

caccatggag ggctggacag acgtgctcta ctgggtaaat gatgcgatag gatgggaatg 

gccatgggtg tattttgtta gtctgatcat ccttggctca tttttcgtcc ttaacctggt

tcttggtgtc cttagtggag aattctcaaa ggaaagagag aaggcaaaag cacggggaga 

tttccagaag ctccgggaga agcagcagct ggaggaggat ctaaagggct acttggattg 

gatcacccaa gctgaggaca tcgatccgga gaatgaggaa gaaggaggag aggaaggcaa 

acgaaatact agcatgccca ccagcgagac tgagtctgtg aacacagaga acgtcagcgg 

tgaaggcgag aaccgaggct gctgtggaag tctctggtgc tggtggagac ggagaggcgc

ggccaaggcg gggccctctg ggtgtcggcg gtggggtcaa gccatctcaa aatccaaact 

cagccgacgc tggcgtcgct ggaaccgatt caatcgcaga agatgtaggg ccgccgtgaa 

gtctgtcacg ttttactggc tggttatcgt cctggtgttt ctgaacacct taaccatttc 
ctctgagcac tacaatcagc cagattggtt gacacagatt caagatattg ccaacaaagt
cctcttggct ctgttcacct gcgagatgct ggtaaaaatg tacagcttgg gcctccaagc
atatttcgtc tctcttttca accggtttga ttgcttcgtg gtgtgtggtg gaatcactga
gacgatcctg gtggaactgg aaatcatgtc tcccctgggg atctctgtgt ttcggtgtgt
gcgcctctta agaatcttca aagtgaccag gcactggact tccctgagca acttagtggc
atccttatta aactccatga agtccatcgc ttcgctgttg cttctgcttt ttctcttcat
tatcatcttt tccttgcttg ggatgcagct gtttggcggc aagtttaatt ttgatgaaac
gcaaaccaag cggagcacct ttgacaattt ccctcaagca cttctcacag tgttccagat
cctgacaggc gaagactgga atgctgtgat gtacgatggc atcatggctt acgggggccc
atcctcttca ggaatgatcg tctgcatcta cttcatcatc ctcttcattt gtggtaacta
tattctactg aatgtcttct tggccatcgc tgtagacaat ttggctgatg ctgaaagtct
gaacactgct cagaaagaag aagcggaaga aaaggagagg aaaaagattg ccagaaaaga
gagcctagaa aataaaaaga acaacaaacc agaagtcaac cagatagcca acagtgacaa
caaggttaca attgatgact atagagaaga ggatgaagac aaggacccct atccgccttg
cgatgtgcca gtaggggaag aggaagagga agaggaggag gatgaacctg aggttcctgc
cggaccccgt cctcgaagga tctcggagtt gaacatgaag gaaaaaattg cccccatccc
tgaagggagc gctttcttca ttcttagcaa gaccaacccg atccgcgtag gctgccacaa
gctcatcaac caccacatct tcaccaacct catccttgtc ttcatcatgc tgagcagcgc
tgccctggcc gcagaggacc ccatccgcag ccactccttc cggaacacga tactgggtta
ctttgactat gccttcacag ccatctttac tgttgagatc ctgttgaaga tgacaacttt
tggagctttc ctccacaaag gggccttctg caggaactac ttcaatttgc tggatatgct
ggtggttggg gtgtctctgg tgtcatttgg gattcaatcc agtgccatct ccgttgtgaa
gattctgagg gtcttaaggg tcctgcgtcc cctcagggcc atcaacagag caaaaggact
taagcacgtg gtccagtgcg tcttcgtggc catccggacc atcggcaaca tcatgatcgt
cactaccctc ctgcagttca tgtttgcctg tatcggggtc cagttgttca aggggaagtt
ctatcgctgt acggatgaag ccaaaagtaa ccctgaagaa tgcaggggac ttttcatcct
ctacaaggat ggggatgttg acagtcctgt ggtccgtgaa cggatctggc aaaacagtga
tttcaacttc gacaacgtcc tctctgctat gatggcgctc ttcacagtct ccacgtttga
gggctggcct gcgttgctgt ataaagccat cgactcgaat ggagagaaca tcggcccaat
ctacaaccac cgcgtggaga tctccatctt cttcatcatc tacatcatca ttgtagcttt
cttcatgatg aacatctttg tgggctttgt catcgttaca tttcaggaac aaggagaaaa
agagtataag aactgtgagc tggacaaaaa tcagcgtcag tgtgttgaat acgccttgaa
agcacgtccc ttgcggagat acatccccaa aaacccctac cagtacaagt tctggtacgt
ggtgaactct tcgcctttcg aatacatgat gtttgtcctc atcatgctca acacactctg
cttggccatg cagcactacg agcagtccaa gatgttcaat gatgccatgg acattctgaa
catggtcttc accggggtgt tcaccgtcga gatggttttg aaagtcatcg catttaagcc
taaggggtat tttagtgacg cctggaacac gtttgactcc ctcatcgtaa tcggcagcat
tatagacgtg gccctcagcg aagcggaccc aactgaaagt gaaaatgtcc ctgtcccaac
tgctacacct gggaactctg aagagagcaa tagaatctcc atcacctttt tccgtctttt
ccgagtgatg cgattggtga agcttctcag caggggggaa ggcatccgga cattgctgtg
gacttttatt aagtcctttc aggcgctccc gtatgtggcc ctcctcatag ccatgctgtt
cttcatctat gcggtcattg gcatgcagat gtttgggaaa gttgccatga gagataacaa
ccagatcaat aggaacaata acttccagac gtttccccag gcggtgctgc tgctcttcag
gtgtgcaaca ggtgaggcct ggcaggagat catgctggcc tgtctcccag ggaagctctg
tgaccctgag tcagattaca accccgggga ggagtataca tgtgggagca actttgccat






PATENT 

Atty. Dkt. No. 022041001420 



tgtctatttc atcagttttt acatgctctg tgcatttctg atcatcaatc tgtttgtggc 

tgtcatcatg gataatttcg actatctgac ccgggactgg tctattttgg ggcctcacca 

tttagatgaa ttcaaaagaa tatggtcaga atatgaccct gaggcaaagg gaaggataaa 

acaccttgat gtggtcactc tgcttcgacg catccagcct cccctggggt ttgggaagtt 

atgtccacac agggtagcgt gcaagagatt agttgccatg aacatgcctc tcaacagtga

cgggacagtc atgtttaatg caaccctgtt tgctttggtt cgaacggctc ttaagatcaa 

gaccgaaggg aacctggagc aagctaatga agaacttcgg gctgtgataa agaaaatttg 

gaagaaaacc agcatgaaat tacttgacca agttgtccct ccagctggtg atgatgaggt 

aaccgtgggg aagttctatg ccactttcct gatacaggac tactttagga aattcaagaa 

acggaaagaa caaggactgg tgggaaagta ccctgcgaag aacaccacaa ttgccctaca

ggcgggatta aggacactgc atgacattgg gccagaaatc cggcgtgcta tatcgtgtga 

tttgcaagat gacgagcctg aggaaacaaa acgagaagaa gaagatgatg tgttcaaaag 

aaatggtgcc ctgcttggaa accatgtcaa tcatgttaat agtgatagga gagattccct 

tcagcagacc aataccaccc accgtcccct gcatgtccaa aggccttcaa ttccacctgc 

aagtgatact gagaaaccgc tgtttcctcc agcaggaaat tcggtgtgtc ataaccatca

taaccataat tccataggaa agcaagttcc cacctcaaca aatgccaatc tcaataatgc 

caatatgtcc aaagctgccc atggaaagcg gcccagcatt gggaaccttg agcatgtgtc 

tgaaaatggg catcattctt cccacaagca tgaccgggag cctcagagaa ggtccagtgt 

gaaaagaacc cgctattatg aaacttacat taggtccgac tcaggagatg aacagctccc 

aactatttgc cgggaagacc cagagataca tggctatttc agggaccccc actgcttggg

ggagcaggag tatttcagta gtgaggaatg ctacgaggat gacagctcgc ccacctggag 

caggcaaaac tatggctact acagcagata cccaggcaga aacatcgact ctgagaggcc 

ccgaggctac catcatcccc aaggattctt ggaggacgat gactcgcccg tttgctatga 

ttcacggaga tctccaagga gacgcctact acctcccacc ccagcatccc accggagatc 

ctccttcaac tttgagtgcc tgcgccggca gagcagccag gaagaggtcc cgtcgtctcc

catcttcccc catcgcacgg ccctgcctct gcatctaatg cagcaacaga tcatggcagt 

tgccggccta gattcaagta aagcccagaa gtactcaccg agtcactcga cccggtcgtg 

ggccacccct ccagcaaccc ctccctaccg ggactggaca ccgtgctaca cccccctgat 

ccaagtggag cagtcagagg ccctggacca ggtgaacggc agcctgccgt ccctgcaccg 

cagctcctgg tacacagacg agcccgacat ctcctaccgg actttcacac cagccagcct

gactgtcccc agcagcttcc ggaacaaaaa cagcgacaag cagaggagtg cggacagctt 

ggtggaggca gtcctgatat ccgaaggctt gggacgctat gcaagggacc caaaatttgt 

gtcagcaaca aaacacgaaa tcgctgatgc ctgtgacctc accatcgacg agatggagag 

tgcagccagc accctgctta atgggaacgt gcgtccccga gccaacgggg atgtgggccc 

cctctcacac cggcaggact atgagctaca ggactttggt cctggctaca gcgacgaaga

gccagaccct gggagggatg aggaggacct ggcggatgaa atgatatgca tcaccacctt 

gtagccccca gcgaggggca gactggctct ggcctcaggt ggggcgcagg agagccaggg 

gaaaagtgcc tcatagttag gaaagtttag gcactagttg ggagtaatat tcaattaatt 

agacttttgt ataagagatg tcatgcctca agaaagccat aaacctggta ggaacaggtc 

ccaagcggtt gagcctggca gagtaccatg cgctcggccc cagctgcagg aaacagcagg

ccccgccctc tcacagagga tgggtgagga ggccagacct gccctgcccc attgtccaga 

tgggcactgc tgtggagtct gcttctccca tgtaccaggg caccaggccc acccaactga 

aggcatggcg gcggggtgca ggggaaagtt aaaggtgatg acgatcatca cacctcgtgt 

cgttacctca gccatcggtc tagcatatca gtcactgggc ccaacatatc catttttaaa 

ccctttcccc caaatacact gcgtcctggt tcctgtttag ctgttctgaa ata

CB410657

GTACTGTGCCGGGGCAGCTCTGAAATTATGGATATTCTTATCCTCCTGGTTCCTTCGGTG
CCAATGGTAACCTAATACCAGCCGCAGGGAGCGCCATTTCTCCTAAAGGGCTACACCACT 
GTCAACATTATCCTGGACTCTGTGTCTCTCTCTGTTGGGTCTTGTGGCATCACATCAGGC
CAAAATTGCCAGACCAGGACCCTAAGTGTCTGATAGAGGCGATGATCTTTTCAAAGTCAG 
TAC 

BQ372430

TGCAGCAANTGGCACGGAATGTAGGAGTGGGTGGGTGGGACCGAACGGAGGCATCACCAA
CTTTGATAACTTGGCCTATGCCATGCTTACGGTGTTTCAGTGCATCACCATGGAGGGCTG 
GACAGATGTGCTCTACTGGGTAAATGATGCGATAGGATGGGAATGGCCATGGGCGTATTT 
TGTTAGTCTGATCATCCTTGGCTCATTTTTCGTCCTTAACCTGGTTCTTGGTGTCCTTAG 
TGGAGAATTCTCAAAGGAAAGAGAGAAGGCAAAAGCACGGGGAGATTTCCAGAAGCTCCG 

GGAGAAGCAGCAGCTGGAGGAGGATCTAAAGGGCTACTTGG

BQ366601 

ATGACTACGGGGGAAGTTCATTCTGACCTTCCAGACTAGCTAGTACTATATGAAATCCGA 
GAGACGGAATGAACACGGACTGATGGGAAAGTACCCTGCGAAGAACACCACAATTGCCCT 
ACAGGCGTGATTAAGGACACTGCATGATAGTTGCTCCAGAATGCCGGCGTGCTATATCGT
GTGATTTGCAAGATGACGAGCGTGAGGAAACAAAACGAGAAGAAGAAGATGATGTGTTCA 
AAAGAAATGGTGCCCTGCTTGGAAACCATGTCAATCATGTTAATAGTGATAGGAGAGATT 
CCCTTCAGCAGACCAATACCACCCACCGTCCNCTGCATGTCCAAAGGCCTTCAATTCCAC 
CTGCAAGTGATACTGAGAAACCGCTGTTCCTCCAGCAGGAAATTCG 

BQ324528 

TACATCTCCGCTATCTGTGCCGTGTAACACGGTGTCCAGTCTCGTTAGGGAGGGGCTGCT 
GGAGGGGTGGCCCACGACCGGGTCGAGTGACTCGGTGAGCACTTCTGTGCTTTACTTGAA 
TCTAGGCCGGCAACTGCCATGATCTGTTGCTGCATTAGATGCAGAGGCAGTGCCGCGCGA 
TGGTGAAGATGGGAGACGACGGGACCTCTTGCTGGCTGCTCTGCCGGCGCAGGCAC

BQ318830 

TGTCGTGACTGGCGATACCTGGCGTTAGTGTGTACATGGTGTTCATAATTGCTGCTGCAT 
AACATTTTGTGAGAATTAATGTGACAATGTATGTGCAGTGCTTAGCACATAGCAAGTGCT 
CATGAATGGTAGCCACCAAGATGGCTGTTGTCATTTTAGTTTGCAGCAGTTCCACTTGTC
ATCATTGAGTTCCCAGGGAGTCCCCTCTTCTTTGGGAACAGACTTGCTCTCTGTAGCTCC 
ATTGCGGTAAAAACAGATGAGGTTAATCCCTGTCCCAATCATTTTGGAGATGGCGTCGTT 
TGTATTCCAATTCCACAGCCCAGTTCTTGTCTTTGTCTTCCTTTTATTTAAGCAGCAGCC 
ACACAGAATTAGCCCTTTTCAAAAATAAATAAGATTATCATCCTGTTTTGCGTCCCTGGG 
GTAACAGACTCTAACATTTCTTTCTCTTTCTCTTCTTTCAGATTGTCTAGTGTAATTTTG
GAACAATTAACCAAAGAAACAGAAGGCGGGAACCACTCACGCGGCAAATCTGGAGGCTTT 
GATGTCAAAGCCCTCCGTGCCTTTCGAGTGTTGCGACCACTTCGAA 

AL708030

AGTTCCCACCTCAACAAATGCCAATCTCAATAATGCCAATATGTCCAAAGCTGCCCATGG 
AAAGCGGCCCAGCATTGGGAACCTTGAGCATGTGTCTGAAAATGGGCATCATTCTTCCCA 
CAAGCATGACCGGGAGCCTCAGAGAAGGTCCAGTGTGAAAAGGTCCGACTCAGGAGATGA 
ACAGCTCCCAACTATTTGCCGGGAAGACCCAGAGATACATGGCTATTTCAGGGACCCCCA 

CTGCTTGGGGGAGCAGGAGTATTTCAGTAGTGAGGAATGCTACGAGGATGACAGCTCGCC
CACCTGGAGCAGGCAAAACTATGGCTACTACAGCAGATACCCAGGCAGAAACATCGACTC 
TGAGAGGCCCCGAGGCTACCATCATCCCCAAGGATTCTTGGAGGACGATGACTCGCCCGT 
TTGCTATGATTCACGGAGATCTCCAAGGAGACGCCTACTACCTCCCACCCCAGCATGTGA 
GGCCAGATTTTTTGTTTTTGGGTGGAACCTCCCGGGGAACAGTGTACCTTTCCCCCAACC 

CCCGCTCTG

BM509161 

ATTCGGCACGAGCCTCCTTCAACTTTGAGTGCTCTGCCCCTTGGGTATCCATAGTTACGG 
TTTTCTCTGTGGCCCACCCAGGGTGTTTTTTGCATCGCTGGTGCAGAAATGCACAGGTGG 

ATGAGATATAGCTGCTCTTGTCCTCTGGGGACTGGTGGTGCTGCTTAAGAAATAAGGGGT
GCTGGGGACAGAGGAGCAACGTGGTGATCTATAGGATTGGAGTGTCGGGGTCTGTACAAA 
TCGTATTGTTGCCTTTTACAAAACTGCTGTACTGTATGTTCTCTTTGAGGGCTTTTATAT 
GCAATTGACTGAGGGCTGAAGTTTTCATTAGAATGCACTCACACTCTGACTGTACGTCCT 
GATGAAAACCCACTTTTGGATAATTAGAACCGTCAAGGCTTCATTTTCTGTCAACAGAAT 

TAGGCCGACTGTCAGGTTACCTTGGCAGGGATTCCCTGCAATCAAAAAGATAGATGATAG
GTAGCAATTTTGGTCCAAAATTTTTAATAGTATACAGACAACCTGTTAATTTTTTTTTTT 
TTTTTTTTTGTAAATAACAAACACCACTTTGTTATGAAGACCTTACAAACCTCTT 

N85902 

GGAAAACTCAAGTCCAGAGCAATACTACGTAAAATTCAGAAGTGAGAACATACAAAGGCA
ACACACAGGCTGACGAAGAAACAGAAAGAAGATACTGACCTGAGTTTGGATTTTGAGATG 
GCTTGACTGAAAGAAAGACAAAAAGTGTTAAGATTCTGGTTCCGAGGGCTTGAGCACACA 
CTCCCCATCATTTCAGCTGGAGATTTCAT 

BQ774355

TTTTTTTTTTTTTTTTTTATTCTGAAGAACATTTTGCATTTTAATAAAAAAGGATTTATG 
TCAGGAAAGAGTCATTTACAAACCTTGAAGTGTTTTTGCCTGGATCAGAGTAAGAATGTC 
TTAAGAAGAGGTTTGTAAGGTCTTCATAACAAAGTGGTGTTTGTTATTTACAAAAAAAAA 
AAAAAAAAATTAACAGGTTGTCTGTATACTATTAAAAATTTTGGACCAAAATTGCTACCT 
ATCATCTATCTTTTTGATTGCAGGGAATCCCTGCCAAGGTAACTTGACAGTCGGCCTAAT
TCTGTTGACAGAAAATGAAGCCTTGACGGTTCTAATTATCCAAAAGTGGGTTTTCATCAG
GACGTACAGTCAGAGTGTGAGTGCATTCTAATGAAAACTTCTTCAGCCCTCATTCAATTG 
CATACAAAAGCCCTCAAAGAGAACATACAGTACAGCAGTTTTGTAAAAGGCAACAATACG 
ATTTGTACAGACCCCGACACTCCAATCCTATAGATCACCACGTTGCTCCTCTGTCCCCAG 
CACCCCTTATTTCTTAAGCAGCACCACCAGTCCCCAGAGGACAAGAGCAGCTATATCTCA
TCCACCTGTGCATTTCTGCACCAGCGATGCANAAAACACCCTGGGGTGGGCCACAGAGAA 
AACCGTAACTATGGATACCCAAGGGGC 

CA774243 

TAAATAACAAACACCACTTTGTTATGAAGACCTTACAAACCTCTTCTTAAGACATTCTTA
CTCTGATCCAGGCAAAAACACTTCAAGGTTTGTAAATGACTCTTTCCTGACATAAATCCT 
TTTTTATTAAAATGCAAAATGTTCTTCAGAATAAAACTGTGTAATAATTTTTATACTTGG 
GAGTGCTCCTTGCACAGAGCTGTCATTTGCCAGTGAGAGCCTCCGACGGGGCAGGTACTG 
TGCCAGGGCAGCTCTGAAATTATGGATATTCTTATCCTCCTGGTTCCTTCGGTGCCAATG 

GTAACCTAATACCAGCCGCAGGGAGCGCCATTTCTCCTAAAGGGCTACACCACTGTCAAC
ATTATCCTGGACTCTGTGTCTCTCTCTGTTGGGTCTTGTGGCATCACATCAGGCCAAAAT 
TGCCAGACCAGGACCCTAAGTGTCTGATAGAGGCGATGATCTTTTCCAAAGTCAGTACTT 
ACAAACTGGCATTCTTACAGGCTGCACCATTTCCTAGTATGTCTGCTTTAAGCCTGGTTC 
AACCTCTCATCGAATATTAAATTTTTCTTTGTA 

CA436347 

TTTTTTTTTTTTTTTCTTGGGGAAAGATGATAGGTTTATAGTGACTCAAAATATTTTAAA 
AAAATTTCTGTAGGGTCAAGTTCTTTCAAACTTAAAATTTTAACCCCAGAGGATTTTCGC 
TGAATAAATGAAAATTGGCTCTATTTCTTCAACTTCGGGATAGCCCGAGTAAAAATACTA 
ATAATTTCTAAATTTTAGGGGGGAACTACAATTATTAGGACCCATGGATATTGCTGCAGT
TCAAATACAATACAGTAATTACAAAATATAGACCATCTCTTTACAAATACAAATTATAGT 
ATATTACAAGTCATGTACAGTAAATCTATAATTTTAAACAAACTAGTGTATCTAAGTTTA
CCTGGTTGCGAGTGCATTATTATTCCAGTTTACAGTTGCCCTTAGCGTGACAGTCAGAAA 
CCGACCATCGGAGTGATATTCTCTTATGTAAAC 

CA389011

TGATTACTTGTAGCAAAGTACTTCCCCACATTTAGCTGGATTTGTCTTTGGTTTGAAGAG 
GCTAATACGTGAAAGATTTGTTCACAGTTGGATGTCCCCTTTTCTGAACCATGAAGTAAT 
ATTGTGAATGGAGTTGAATGCTGAGGTTAGGGTGCCGGAAAGATTCAGGGTCCTTCGGTA 
CCCTCACATGGCTTGGCTTTGGTAGAACAAGAAACTAAGCTCTGATTTGGCTTTAAATGA
GAGTGCTAAATTTCCTTTTTCTAATAAAGAACCTAGCTAAACATTTATATATACTTTTGA 
ACACTGAACTNTCTTGTTGCAGAGTTAACAGCTGTTGGGGGTAGCTGACAGCTGGATCCT 
GGTGCTGTTGGTACCATGGTACCTGAAGTGCACAGGCTGGTAGCCACACCTGACA 

BU679327
TTTTTTTTTTTTTTTCTTACAAAGAAAAATTTAATATTCGATNGAGAGGTTGAACCAGGC
TTAAAGCAGACATACTAGGAAATGGTGCAGCCTGTAAGAATGCCAGTTTGTAAGTACTGA 
CTTTGGAAAAGATCATCGCCTCTATCAGACACTTAGGGTCCTGGTCTGGCAATTTTGGCC 
TGATGTGATGCCACAAGACCCAACAGAGAGAGACACAGAGTCCAGGATAATGTTGACAGT 
GGTGTAGCCCTTTAGGAGAAATGGCGCTCCCTGCGGCTGGTATTAGGTTACCATTGGCAC
CGAAGGAACCAGGAGGATAAGAATATCCATAATTTCAGAGCTGCCCTGGCACAGTACCTG 
CCCCGTCGGAGGCTCTCACTGGCAAATGACAGCTCTGTGCAAGGAGCACTCCCAAGTATA 
AAAATTATTACACAGTTTTATTCTGAAGAACATTTTGCATTTTAATAAAAAAGGATTTAT 
GTCAGGAAAGAGTCATTTACAAACCTTGAAGTGTTTTTGCCTGGATCAGAGTAAGAATGT 
CTTAAGAAGAGGTTTGTAAGGTCTTCATAACANAGTGGTGTTTGTTATTTACAAAAAAAA
AAAAAAAAAAAATAAAAAAAAAAAAAAAAACCTCGTGCCGAATTCT 



BU608029 

TTTTTTTTTTTTTTTTGTAAATAACAAACACCACTTTGGTTATGAAGACCTTACAAACCT 
CTTCTTAAGACATTCTTACTCTGATCCAGGCAAAAACACTTCAAGGTTTGTAAATGACTC
TTTCCTGACATAAATCCTTTTTTATTAAAATGCAAAATGTTCTTCAGAATAAAACTGTGT 
AATAATTTTTATACTTGGGAGTGCTCCTTGCACAGAGCTGTCATTTGCCAGTGAGAGCCT 
CCGACAGGGCAGGTACTGTGCCAGGGCAGCTCTGAAATTATGGATATTCTTATCCTCCTG 
GTTCCTTCGGTGCCAATGGTAACCTAATACCAGCCGCAGGGAGCGCCATTTCTCCTAAAG 
GGCTACACCACTGTCAACATTATCCTGGACTCTGTGTCTCTCTCTGTTGGGTCTTGTGGC
ATCACATCAGGCCAAAATTGCCAGACCAGGACCCTAAGTGTCTGATAGAGGCGATGATCT 
TTTCCAAAGTCAGTACTTACAAACTGGCATTCTTACAGGCTGCACCATTTCCTAGTATGT 
CTGCTTTAAGCCTGGTTCAACCTCTCATCGAATATTAAATTTTTCTTTGTAAGAAAAATT 
TGAAGTTGTAGAGCATGGTTTTTTGTTTTCCCTTGTCTTAGGAAAGTTTTAAGATGAAAT 
GTTTTTCC

BU073743 

AGTACACAAGGTGAAACTGCTCCAGTTTTTCTCATAGCAGGGTCAGCAGGAAAGCAAGTG 
GTGCCCCTGGTCCCATCTCACACAGGTGAGACTGCACCGAGAGGTAACGTGGCCCTCACA 

GCCCACCACGCCTGGCCTTCGCCCAATTCTGAAACTTCGTAGGATAGAGCTGGAAAGTGC
CACATGGTGAAGCGAGATCCAGCTGTCTGGGTGGATGTCGGAGTCCATAGGCTGAGCAGA 
GATGGTTCTTAGTGAGGTTCTCGCTGCCAGTTGACGGTGAAATCATAGCTGCCATTTACA 
TTTTGTGAGATTATGAAAAACATAAGACTAAAGAAACTAAATGTGTTATTCCTGTGGACA 
CAAAAATGTGTGTTTTTCAGATGGGGAGGGGACCAAAAAGGAAAAACATTTCATCTTAAA 

ACTTCCCTAAGACAAAGGAAAACAAAAAACCATGCTCTACAACTTCAAATTTTTCTTACA
AAGAAAAATTTAATAT 

BE175413 

AGCTGAGGAAACAAAACGAGAGAAGAAGATGATGTGTTCAAAAGAAATGGTGCCCTGCTT 
GGAAACCATGTCAATCATGTTAATAGTGATAGGAGAGATTCCCTTCAGCAGACCAATACC
ACCCACCGTCCCCTGCATGTCCAAAGGCCTTCAATTCCACCTGCAAGTGATACTGAGAAA 
CCGCTGTTTCCTCCAGCAGGAAATTCGGTGTGTCATAACCATCATAACCATAATTCCATA
GGAAAGCAAGTTCCCACCTCAACAAATGCCAATCTCAATAATGCCAATATGTCCAAAGCT 
GCCCATGGAAAGCGGCCCAGCATAGGGAACCTTGAGCATGTGTCTGAAAATGGGCATCAT 
TCTTCCCACAAGCATGACCGGGAGCCTCAGAGAAGGTCCAGTGTGAAAAGGTCCGACTCA 
GGAGATGAACAGCTCCCAACTATTGGCCGGGAAGACCCAGAGATACATGGCTATTTCAGG
CACCCCCACGGCTTGGGGGAGCAGGAGTATTTCAGTAGTGAGGAATGCTACGAGGATGAC 
AGCTCGCCCACCTGGAGCAGGCAAAACTATGGCTACTACAGCAGATACCCAGGCAGAAAC 
ATCGACTCTGAGAGGCGCGAGGCTACATCATCCCAAGATTCTGGAGGAGATGACTCGCCG 
TTTGTATGATCACGAGATCTCAAGAGAGCTATACTCCCACC 

AW969248 

TCTTGTGGAAAGATGATAGGTTTATAGTGACTCAAAATATTTTAGAAAAATTTCTGTAGG 
GTCAAGTTCTTTCAAACTTAAAATTTTAACCCCAGAGGATTTTCGCTGAATAAATGAAAA 
TTGGCTCTATTTCTTCTACTTCTGGATAGCCCGAGTAAAAATACTAATAATTTCTAGATT 

TTAGTGGGGAACTACAATTATTAGGACCCATGGATATTGCTGCAGTTCAAATACAATACA
GTAATTACAAAATATAGACCATCTCTTTACAAATCCAAATTATAGTATATTACAAGTCAT 
GTACCGTAAATCTATTTTAAACAAACTAGGGTATCTAAGTTTACCTGGTTGCAAGTGCAT 
TATTATTCCAGTTTACAGTTGCCCTTAGCGTGACAGTCAGAAACCGACCATCGGAGTGAT 
ATTCTCTTATGTAAACTGGCGTCACATCACAGAAAACCTTATTTATTTGGGGGAAAGGGT 

TTAAAAATGGATATGTTGGGCCCAGTGACTGATAC

AI90811 

GGAAAAGATCATCGCCTCTATCAGACACTTAGGGTCCTGGTCTGGCAATTTTGGCCTGAT 
GTGATGCCACAAGACCCAACAGAGAGAGACACAGAGTCCAGGATAATGTTGACAGTGGTG 
TAGCCCTTTAGGAGAAATGGCGCTCCCTGCGGCTGGTATTAGGTTACCATTGGCACCGAA
GGAACCAGGAGGATAAGAATATCCATAATTTCAGAGCTGCCCTGGCACGGTACCTGCCCC 
GTCGGAGGCTCTCACTGG 

BF754485 

GATGCGTGATGGCTGATCTAGAGGTATCCCATGGACTCTCATCGCAGCTCCTGGTACACA
GACGAGCCCGACATCTCCTACCGGACTTTCACACCAGCCAGCCTGACTGTCCCCAGCAGC 
TTCCGGAACAAAAACAGCGACAAGCAGAGGAGTGCGGACAGCTTGGTGGAGGCAGTCCTG 
ATATCCGAAGGCTTGGGACGCTATGCAAGGGACCCAAAATTTGTGTCAGCAACAAAACAC 
GAGATCGCTGATGCCTGTGACCTCACCATCGACGAGATGGAGAGTGCAGCCAGCACCCTG 

CTTAATGGGAACGTGCGTCCCCGAGCCAACGGGGATGTGGGCCCCCTCTCACACCGGCAG
GACTATGAGCTACAGGACTTTGGTCCTGGCTACAGCGACGAAGGGCCAGACCCTGGGAGG 
GATGAGGAGGACCTGGCGGATGAAATGATATGCATCACCACCTTGTAGCCCCCAGCGAGG 
GGCAGACTGGCTCTGGCCTCAGGTGGGGCG 

BI015409
CGCTCGTTCGCTGTGCCAGGACAAAGTCCTGTAGCTCATAGTCCTGCCGTGTGAGAGGGG
GCCACATCCCCGTTNCTCGGGACGCACGACCCATTAAGCAGGGTGCTGGCTGCCCCCTCC 
ATCTCGTCGATGGAGAGGTCANCAGGCATCAGCGATTTCGTGTTTTGTGTGCGTGACACA 
AATTTTGGGTCCCTTGCATACGCGTCCCACAGCCTTACGGAGTATCAGCGACTGCTCTCC 
ACCAATGCTGCCCGCGACTCCTACTGCTTGTCCGCTGTTTTTGGTTCCGGAAGCTGCTGG
GGACAGTCAGGCTGGCTGGTGTGAAAGTCCGGTAGGAGATGTCGGGCTCGTCTGTGTACC 
AGGAGCTGCGGTGCAGGGACGGCAGGCTGCCGTTCACCTGGTCCG 

BG202552 

GAGTTTCGAGCTTCTCTTTTCCTAAGNGAAAAAANAAAGAANCACAAGNAAACCAAATAA 
CCATGTTACTCTGTATAAAAATGCTAATCAGGGAATTCTGAATCAATAATGCTCCAATGA 
AGGACAGAATTTAATTAGAAACAACACTAACCACAAGAGCCTAGCACAACCCAAACTCAG 
AGCTTCCTGGTAATCTCAATGCGATGGATTCATTACACAGACCATCTTATTAAAATTCTC 
ATCTGAGAGCTAATCAGCATTGAATGCATCATTTATTTTATGACACCAAAATTAACTGCA 
GTGATTCTTTAAGCATGGGGACACGTGACTCCCACTCTCAGCCCCGAGGGATGACAGCCA 
AGAGCCTGGCTTCTGCCCAAGATTCCATCCGTTTTGGTCTGCAGTGCATGGTCAACCATG 
ATCCACAAAGCAGCAACCCGGGGGCTGTAGCTGCCGTGATGCGGGGGTAAGCCTGGCAGG 
CTGCAACTGTTGCAGGGCTCCCAACACAGCCCCTGGACAAACGCGTCAGGGGAAAATAGG 
GTTACCTGGCAATCTTTTTCCTCTCCTTTTCTTCCGCTTCTTCTTTCTGAGCAGTGTTCA 
GACTTTCAGCATCAGCCAAAGTGTCTACAGCGATGGCCAAGAAGACATTCAGTAGAATAT 
CTAATTACAACTTTTTAAGGGCACAACACACTACTAAATGCAACTACGTGCGGCCAACAA 
TGGCAACGCCACACACCTCTGCATCCCGGGAAGCTGGGTAGTAGGTGACGTCCCCAAGTG 
TTATACTCACACAGCAAACCTAGAGTACCAGAGCCCTGCTTTTCAAACAANACANAACAA 
ACAAACAACCCAAAGTAAAACCTGGTAAGGGACGTCTTCAGAAGTAAATTAC 

BF883669 

CTGGCTTTCCCATAGCACGCTCGGCAGGAAAGCAAGTGATGCCCCTGGCTCCCATCTCAC 
ACAGGTGACACTGCACCGAGAGGTAACGTGGCCCTCACAGCCCACCACGCCTGGCCTTCG 
CCCAATTCTGAAACTTCGTAGGATAGAGCTGGAAAGTGGCACATGGTGAAGCGAGATCCA 
GCTGTCTGGGTGGATGTCGGAGCTCCATAGGCTGAGCAGAGATGGTTCTTAGTGAGGTTC
TCGCTGCCAGTTGACGGTGAAATCATAGCTGCCATTTACATTTTGTGAGATTATGAAAAA 
CATAAGACTAAAGAAACTAAATGTGTTATTCCTGTGGACACAAAAATGTGTGTTTTTCAG 
ATGGGGAGGGGACCAAAAAGGAAAAACATTTCATCTTAAAACTTTCCTAAGACAAAGGAA 
AACAA 

BF817590 

CTCAGCATGNATGAAACAGGATGAGGTTGGTGAAGATGTGGTGGTTGATGAGCTTGTGGC 
AGCCTACGCGGATCGGGTTGGTCTTGCTAAGAATGAAGAAAGCGCTCCCTTCAGGGATGG 
GGGCAATTTTTTCCTTCATGTTCAACTCCGAGATCCTTCGAGGACGGGGTCCGGCAGGAA 
CCTCAGGTTCATCCTCCTCCTCTTCCTCTTCCTCTTCCCCTACGGGCACATCGCAAGGCG
GATAGGGGTCCTTGTCTTCATCCTCTTCTCTATAGTCATCAATTGTAACCTTGTTGTCAC 
TGTTGGCTATCTGGTTGACTTCTGGTTTGTTGTTCTTTTTATTTTCTAGGCTCTCTTTTC
TGGCAATCTTTTTCCTCTCCTTTTCTTCCGCTTCTTCTTTCTGAGCAGTGTTCAGACTTT 
CAGCATCAGCCAAATGGTCTA 

BF807128

TCAAAGTCGAAGGAGGATCTCCGCGTGGGATGCTGGGGTGGGAGGTAGTAGGCGTCTCCT 
TGGAGATCTCCGTGAATCATAGCAAACGGGCGAGTCATCGTCCTACAAGAATCCTAGTGG 
ATGATGGTAGCCTCGGGGCCTCTCAGAGTCGATGTTTCTGCCTGG 

BF806160

CTCGCCCGTTTGCTATGAGTCACGGAGATCTCCAAGGAGACGCCTACTACCTCCCACCCC 
AGCATCCCACCGGAGATCCTCCTTCAACTTTGAGTGCCTGCGCCGGCAGAGCAGCCAGGA 
AGAGGTCCCGTCGTCTCCCATCTTCCCCCATCGCACGGCCCTGCCTCTGCATCTAATGCA 
GCAACAGATCATGGCAGTTGCCGGCCTAGATTCAAGTAAAGCCCAGAAGTACTCACCGAG 
TCACTCGACCCGGCCGTGGGCCACCCCTCCAGCAACCCCTCCCTACCGGGACTGGACACC
GTGCTACACCCCCCAGATGACGCCGATGTA 

BF805244 

CCAGGCAGAAACATCGACTCTGAGAGGCCCCGAGGCTACCATCATCCCCAAGGATTCTTG 
GAGGACGATGACTCGCCCGTTTGCTATGATTCACGGAGATCTCCAAGGAGACGCCTACTA
CCTCCCACCCCAGCATCCCACCGGAGATCCTCCTTCAACTTTGAGTGCCTGCGCCGGCAG 
AGCAGCCAGGAAGAGGTCCCGTCGTCTCCCATCTTCCCCCATCGCACGGCCCTGCCTCTG 
CATCTAATGCAGCAACAGATCATGGCAGTTGCCGGCCTAGATTCAAGTAAAGCCCAGAAG 
TACTCACCGAGTCACTCGACCCGGTCGTGGGCCACCCCTCCAGCAACCCCTCCCTACCGG 
GACTGGACACCGTGCTACACCCCCCAGATGACGCCGATGTA

BF805235 

TACATCGGCGTCATCTGGGGGGTGTAGCACGGTGTCCAGTCCCGGTAGGGAGGGGTTGCT 
GGAGGGGTGGCCCACGACCGGGTCGAGTGACTCGGTGAGTACTTCTGGGCTTTACTTGAA 

TCTAGGCCGGCAACTGCCATGATCTGTTGCTGCATTAGATGCAGAGGCAGGGCCGTGCGA
TGGGGGAAGATGGGAGACGACGGGACCTCTTCCTGGCTGCTCTGCCGGCGCAGGCACTCA 
AAGTTGAAGGAGGATCTCCGGTGGGATGCTGGGGTGGGAGGTAGTAGGCGTCTCCTTGGA 
GATCTCCGTGAATCATAGCANACGGGCGAGTCATCGTCCTCCAAGAATCCTTGNNGATGA 
TGGTAGCCTCGGNGCCTCTCAGAGTCGATGTTTCTGCCTGNGTATCTGCTCGGGCGAGCC 

GGTACCGAGCT

BF805080 

TACATCGGCGTCATCTGGGGGGTGTAGCACGGTGTCCAGTCCCGGTAGGGAGGGGTTGCT 
GGAGGGGTGGCCCACGACCGGGTCGAGTGACTCGGTGAGTACTTCTGGGCTTTACTTGAA 
TCTAGGCCGGCAACTGCCATGATCTGTTGCTGCATTAGATGCAGAGGCAGGGCCGTGCGA
TGGGGGAAGATGGGAGACGACGGGACCTCTTCCTGGCTGCTCTGCCGGCGCAGGCACTCA 
AAGTTGAAGGAGGATCTCCGGTGGGATGCTGGGGTGGGAGGTAGTAGGCGTCTCCTTGGA 
GATCTCCGTGAATCATAGCAAACGGGCGAG 

T27949 

GCGGACAGCTTGGTGGAGGCAGTCCTGATATCCGAAGCCTTNGGACGCTATGCAAGGGAC 
CCAAAATTTNTTTCAGCAACAAAACACGAAATCGCTGATGCCTGTAACCTCACCATCGAC 
GAGATGGAGAGTNCAGCCAGCACCCTGCTTAATGGGAACGTGCGTCCCCGAGCCAACGGG 
GAT

BE836638 

AAGAAATAGGAGGATAAGAATATCATATTTCAGAGCTGCCCTGGCACAGTACCTGCCCCG 
TCGGAGGCTCTCACTGGCAAATGACAGCTCTGTGCAAGGAGCACTCCCAAGTATAAAAAT 
TATTACACAGTT

BE770685 

CCATTGGTACGAGAGAAATTAGGAGGATAAGATTATCTATTATTCTGAGCTGCCCTGGCA 
CAGTACCTGCCCCGTCGGAGGCTCTCACTGGCAAATGACAGCTCTGTGCAAGGAGCACTC 
CCAAGTATAAAAATTATTACATAGTTTTATTCTGAAGAACATTTTGCATTTTAATAAAAA
AGGATTTATGTCAGGAAAGAGTCATTTACATACCTTGAATTGTTTTTGCCTGGATCAGAG 
TAAGAATGTCTTAAGAAGAGGTTTGTAAGGTCTTCATAACAAAGTGGTGTTTGTTATTTA 
CAAAAAAAAAAAAAAAAAAAATTTTTATACCGGGTTTGTCTGTATACAAATTTCTCTG 

BE769065

TCCAGAGTAGAAGAAATCAGCCAAGTATCATTTATTCAGCGAAAATCCTCTGGGGATTAA 
AATTTTAAGTTTGAAAGAACTTGACACTACAGAAATTTTTCTAAAATATTTTGAGTCACT 
ATAAACCTATCATCTTTCCACAAGATATACCAGATGACTATTTGCAGTCTTTTCTTTGGG 
CAAGAGTTCCATGATTTTGATACTGTACCTTTGGATCCACCATGGGTTGCAACTGTCTTT 
GGTTTTGTTTGTTTGACTTGAACCACCCTCTGGAAAGCTACTCTGGAAA

Sequences identified as those of HOXB13 cluster 



BF676461 

GGGATTCCCCCGGCTGGGTGGGGAGAGCGAGCTGGGTGCCCCCATAGATTCCCCTGCCCG
AACCTCATGAGCCGACCCTCGGCTCCATGGAGCCCGGAAATTATGCCACCTTGGATGGAG 
CCAAGGATATCGAAGGCTTGTTGGGAGCGGGAGGGGGGCGGAATCTGGTCGCCCACTCCC 
TCTCTGACCAGCCACCCAGCGCGCTACGCTTGATGCCTGTGTCAATATGCCCCCTTGATC 
TGCCAGGCTCGGGGAGCGGCCAAAAGCAATGCCCACCCTATGCTCTGGGGGTGCCCAGGG
GACTGTCCCCGGCTCCGTGCCTTATGGTTACTGTGGGGCGGGGTACATACTCCTGCAGAG 
TTGTCCCGGAGCTCGTTGAAACCTTGTGCCGAGGAGAGCCACCCTGGCGGTACCCGGGAA 
GACTCCCCAGGGCGGGAAGAGTACCCCAGCGGCCCAATGAGTTGTGCTTCTATCGGGATA 
TCCGGGACCTACCAGGCCTATGTGCAGGTACTGGACGTGTCCTGTGCTGCAGACTCTGGG
TGTCCGTGGAGCACCGGACATTGGCTCGCTGTGGCCTGTGGCCGGTACCAGTCTTGGGCT 
CTCGGTGTGTGGCTGGACACGCCGGTTGTGTTCGCGGGAGACCGCACCCACCAGGTTCCT 
TTGGGAGGGCCGCTTTGCAGACTCCGGGGGAGGCCCCTCTGAGGCGGGGCCTTTTCGGGG 
GGGCGAAGAAAGCTTTCCGACGCAGGCGCTTGCGGAGCTGGCGGGACATCGGGACACTTC 
ACCCAGCGAAGCGCGGCTTGGGGCCCCTCTGGGCGCGGTCTCGGTTGACACCGGCGAAGA
GTTTCGGGAGAGGCCCATATCTTCTGGGGAGGGCGTTGCGTCGCCCCCG 

BC007092 

ggattccccc ggcctgggtg gggagagcga gctgggtgcc ccctagattc cccgcccccg 

cacctcatga gccgaccctc ggctccatgg agcccggcaa ttatgccacc ttggatggag
ccaaggatat cgaaggcttg ctgggagcgg gaggggggcg gaatctggtc gcccactccc 
ctctgaccag ccacccagcg gcgcctacgc tgatgcctgc tgtcaactat gcccccttgg 
atctgccagg ctcggcggag ccgccaaagc aatgccaccc atgccctggg gtgccccagg 
ggacgtcccc agctcccgtg ccttatggtt actttggagg cgggtactac tcctgccgag 

tgtcccggag ctcgctgaaa ccctgtgccc aggcagccac cctggccgcg taccccgcgg
agactcccac ggccggggaa gagtacccca gccgccccac tgagtttgcc ttctatccgg 
gatatccggg aacctaccag cctatggcca gttacctgga cgtgtctgtg gtgcagactc 
tgggtgctcc tggagaaccg cgacatgact ccctgttgcc tgtggacagt taccagtctt 
gggctctcgc tggtggctgg aacagccaga tgtgttgcca gggagaacag aacccaccag 

gtcccttttg gaaggcagca tttgcagact ccagcgggca gcaccctcct gacgcctgcg
cctttcgtcg cggccgcaag aaacgcattc cgtacagcaa ggggcagttg cgggagctgg 
agcgggagta tgcggctaac aagttcatca ccaaggacaa gaggcgcaag atctcggcag 
ccaccagcct ctcggagcgc cagattacca tctggtttca gaaccgccgg gtcaaagaga 
agaaggttct cgccaaggtg aagaacagcg ctacccctta agagatctcc ttgcctgggt 

gggaggagcg aaagtggggg tgtcctgggg agaccaggaa cctgccaagc ccaggctggg
gccaaggact ctgctgagag gcccctagag acaacaccct tcccaggcca ctggctgctg 
gactgttcct caggagcggc ctgggtaccc agtatgtgca gggagacgga accccatgtg 
acagcccact ccaccagggt tcccaaagaa cctggcccag tcataatcat tcatcctgac 
agtggcaata atcacgataa ccagtactag ctgccatgat cgttagcctc atattttcta 

tctagagctc tgtagagcac tttagaaacc gctttcatga attgagctaa ttatgaataa
atttggaaaa aaaaaaaaaa aaaaaaaaaa aaaaaa 

BM462617 

ATTCCCCCGGCCTGGGTGGGGAGAGCGAGCTGGGTGCCCCCTAGATTCCCCGCCCCCGCA 
CCTCATGAGCCGACCCTCGGCTCCATGGAGCCCGGCAATTATGCCACCTTGGATGGAGCC
AAGGATATCGAAGGCTTGCTGGGAGCGGGAGGGGGGCGGAATCTGGTCGCCCACTCCCCT 
CTGACCAGCCACCCAGCGGCGCCTACGCTGATGCCTGCTGTCAACTATGCCCCCTTGGAT 
CTGCCAGGCTCGGCGGAGCCGCCAAAGCAATGCCACCCATGCCCTGGGGTGCCCCAGGGG
ACGTCCCCAGCTCCCGTGCCTTATGGTTACTTTGGAGGCGGGTACTACTCCTGCCGAGTG 
TCCCGGAGCTCGCTGAAACCCTGTGCCCAGGCAGCCACCCTGGCCGCGTACCCCGCGGAG 
ACTCCCACGGCCGGGGAAGAGTACCCCAGCCGCCCCACTGAGTTTGCCTTCTATCCGGGA 
TATCCGGGAACCTACCAGCCTATGGCCAGTTACCTGGACGTGTCTGTGGTGCAGACTCTG
GGTGCTCCTGGAGAACCGCGACATGACTCCCTGTTGCCTGTGGACAGTTACCAGTCCTGG 
GCTCTCGCTGGTGGCTGGAACAGCCAGATGTGTTGCCAGGGAGAACAGAACCCACCAGGT 
CCCCTTTTGGAAGGCAGCATTTGCAGACTCCAGCGGGCAGCACCCTCCTGACGCCTGCGC 
CTTTCGT 

BG752489 

GCAGGCGACTTGCGAGCTGGGAGCGATTTAAAACGCTTTGGATTCCCCCGGCCTGGGTGG 
GGAGAGCGAGCTGGGTGCCCCCTAGATTCCCCGCCCCCGCACCTCATGAGCCGACCCTCG 
GCTCCATGGAGCCCGGCAATTATGCCACCTTGGATGGAGCCAAGGATATCGAAGGCTTGC 

TGGGAGCGGGAGGGGGGCGGAATCTGGTCGCCCACTCCCCTCTGACCAGCCACCCAGCGG
CGCCTACGCTGATGCCTGCTGTCAACTATCCCCCCTTGGATCTGCCAGGCTCGGCGGAGC 
CGCCAAAGCAATGCCACCCATGCCCTGGGGTGCCCCAGGGGACGTCCCCAGCTCCCGTGC 
CTTATGGTTACTTTGGAGGCGGGTACTACTCCTGCCGAGTGTCCCGGAGCTCGCTGAAAC 
CCTGTGCCCAGGCAGCCACCCTGGCCGCGTACCCCGCGGAGACTCCCACGGCCGGGGAAG 

AGTACCCCAGCCGCCCCACTGAGTTTGCCTTCTATCCGGGATATCCGGGAACCTACCAGC
CTATGGCCAGTTACCTGGACGTGTCTGTGGTGCAGACTCTGGGTGCTCCTGGAGAACCGC 
GACATGACTCCCTGTTGCCTGTGGACAGTTACCAGTCTTGGGCTCTCGCTGGTGGCTGGA 
ACAGCCAGATGTGTTGCCAGGGAGAACAGAAGCCACCAGGTCCCTTTTGGAAGGCAGCAT 
CTGCAGACTCCAGCGGGCAGGACCTCCTGACGCCTGCGGCCTTTCGTCGCGAGCGCAAGA 

AACGCATTCCGTA

BG778198 

GGATTTAAAACGCTTTGGATTCCCCCGGCCTGGGTGGGGAGAGCGAGCTGGGTGCCCCCT 
AGATTCCCCGCCCCCGCACCTCATGAGCCGACCCTCGGCTCCATGGAGCCCGGCAATTAT 

GCCACCTTGGATGGAGCCAAGGATATCGAAGGCTTGCTGGGAGCGGGAGGGGGGCGGAAT
CTGGTCGCCCACTCCCCTCTGACCAGCCACCCAGCGGCGCCTACGCTGATGCCTGCTGTC 
AACTATGCCCCCTTGGATCTGCCAGGCTCGGCGGAGCCGCCAAAGCAATGCCACCCATGC 
CCTGGGGTGCCCCAGGGACGTCCCCAGCTCCCGTGCCTTATGGTTACTTTGGAGGCGGGT 
ACTACTCCTGCCGAGTGTCCCGGAGCTCGCTGAAACCCTGTGCCCAGGCAGCCACCCTGG 

CCGCGTACCCCGCGGAGACTCCCACGGCCGGGGAAGAGTACCCCAGCCGCCCCACTGAGT
TTGCCTTCTATCCGGGATATCCGGGAACCTACCAGCCTATGGCCAGTTACCTGGACGTGT 
CTGTGGTGCAGACTCTGGGTGCTCCTGGAGAACCGCGACATGACTCCCTGTTGCCTGTGG 
ACAGTTACCAGTCTTGGGCTCTCGCTGGTGGGCTGGAACAGCCAGATGTGTTGCCAGCGC 
AGAACAGAACCCACCAGGTCCCTTTTGGAAGGCAGCATTTGCAGACTCCAGCGGGCAGAA 

CCCTCCTGACGCCTGCGCCTTTCGTTCGCGGGCGAAAAA

CB050884 
AAGAAACGCATTCCGTACAGCAAGGGGCAGTTGCGGGAGCTGGAGCGGGAGTATGCGGCT
AACAAGTTCATCACCAAGGACAAGAGGCGCAAGATCTCGGCAGCCACCAGCCTCTCGGAG 
CGCCAGATTACCATCTGGTTTCAGAACCGCCGGGTCAAAGAGAAGAAGGTTCTCGCCAAG 
GTGAAGAACAGCGCTACCCCTTAAGAGATCTCCTTGCCTGGGTGGGAGGAGCGAAAGTGG 
GGGTGTCCTGGGGAGACCAGGAACCTGCCAAGCCCAGGCTGGGGCCAAGGACTCTGCTGA
GAGGCCCCTAGAGACAACACCCTTCCCAGGCCACTGGCTGCTGGACTGTTCCTCAGGAGC 
GGCCTGGGTACCCAGTATGTGCAGGGAGACGGAACCCCATGTGACAGCCCACTCCACCAG 
GGTTCCCAAAGAACCTGGCCCAGTCATAATCATTCATCCTGACAGTGGCAATAATCACGA 
TAACCAGTACTAGCTGCCATGATCGTTAGCCTCATATTTTCTATCTAGAGCTCTGTAGAG 
CACTTTAGAAACCGCTTTCATGAATTGAGCTAATTATGAATAAATTTGGAAGGCGAAAAA
AAAAACCTCGTGCC 



CB050885 

ATTCGGCACGAGGTTTTTTTTTTCGCCTTCCAAATTTATTCATAATTAGCTCAATTCATG 
AAAGCGGTTTCTAAAGTGCTCTACAGAGCTCTAGATAGAAAATATGAGGCTAACGATCAT
GGCAGCTAGTACTGGTTATCGTGATTATTGCCACTGTCAGGATGAATGATTATGACTGGG 
CCAGGTTCTTTGGGAACCCTGGTGGAGTGGGCTGTCACATGGGGTTCCGTCTCCCTGCAC 
ATACTGGGTACCCAGGCCGCTCCTGAGGAACAGTCCAGCAACCAGTGGCCTGGGAAGGGT 
GTTGTCTCTAGGGGCCTC 

BF965191 

GGGTGGGGAGAGCGAGCTGGGTGCCCCCTAGATTCCCCGCCCCCGCACCTCATGAGCCGA 
CCCTCGGCTCCATGGAGCCCGGCAATTATGCCACCTTGGATGGAGCCAAGGATATCGAAG 
GCTTGCTGGGAGCGGGAGGGGGGCGGAATCTGGTCGCCCACTCCCCTCTGACCAGCCACC 

CAGCGGCGCCTACGCTGATGCCTGCTGTCAACTATGCCCCCTTGGATCTGCCAGGCTCGG
CGGAGCCGCCAAAGCAATGCCACCCATGCCCTGGGGTGCCCCAGGGGACGTCCCCAGCTC 
CCGTGCCTTATGGTTACTTTGGAGGCGGGTACTACTCCTGCCGAGTGTCCCGGAGCTCGC 
TGAAACCCTGTGCCAGGCAGCCACCCTGGCCGCGTAACCCGACGGAGACTCTCACGTGCG 
GGGAAGAGTACCCCTAGCGCCCCACATGAGTTTGCCTTCTATCCGGGATATCCGGGACCG 

TACCAGCCTATGGCAGTTACCTGGACGTGTCTGTGGTGCCGACTCTGGGTGCTCCTGGAG
AACCGCGGACATGACTCCTTGTTTGCTGTGCGACGCTCACCAGTCTGGGCTCCTCGTCGG 
TGGTCGCACTCCCACTTTTTGCCGGGCGACATCCCCCGGGGCCCCTTCCGGAACAGCGAC 
CTTGCGAGCCCCCGGGGACACACCCCCGTAAGCGGCCTATCATCGCTGATAAACCTCATC 
AGAGGGCACCGAAAGCCGCGACTCTAACCCCCCCACTACGACTCACGACCGCACAGGTAC 

TCGAACCGCCCAATATCTGGTTCTAACCCATGGCGCATCTCAGCCGCTAGAGAGCCAACC
AAACGCGCCACGCGCAACCACACTACACCACGGCACCCCTTTCATCTCACTCCCACGCCG 
ATCACTCTTCACCCTCCAGAATCATTCCCCTCGCACATCCTACCTATCTCATGCCTCCCA 
GTTCACCCCATTCCCTCCCCTAATCTCACCCACACATTCACGCACGTTCTCACTACGCTT 
CGCTCCGACCCACATCCTCACCCCCACATTCATACCACTTCACCATCACGACCCCCCCCT 

CTCATCGACTCCTGTCTCATTCTCAACCACAGTACTACCAGCTCCAACACACCACTCACC
CCAAGCTATCCATCACCTACACGCTTTCACCCCTCACCGCTCCCAAGTAATTCAGATCAC 
TCAAACACAATCTGCTACATACTCATCCCTCCCCCACTCCCAGTACAGTCCAACCACCGA 
CCAACTACCTCCGCGCCACCCGCGCCGCCCCACCTCACCGGCCCCAACCGCCCGCACAGG
GCACGCACCCCCCGGCAACCGCGCGATCCGGCCGTACACACTCTTGGGCGGCACGCAGCT 
GAGGACATTCCGCGGGAGCGCCCCACCGTGGGCTACGTGGGTCGCGACCCGGCGGGGCGC 
GTGCGGCGTCGCCCGCCCGCCCGCCGACTGCGACCCAGTCGAG 

BU930208 

GGGGCTTTGGATTCCCCCGGCCTGGGTGGGGAGAGCGAGCTGGGTGCCCCCTAGATTCCC 
CGCCCCCGCACCTCATGAGCCGACCCTCGGCTCCATGGAGCCCGGCAATTATGCCACCTT 
GGATGGAGCCAAGGATATCGAAGGCTTGCTGGGAGCGGGAGGGGGGCGGAATCTGGTCGC 
CCACTCCCCTCTGACCAGCCACCCAGCGGCGCCTACGCTGATGCCTGCTGTCAACTATGC 
CCCCTTGGATCTGCCAGGCTCGGCGGAGCCGCCAAAGCAATGCCACCCATGCCCTGGGGT 
GCCCCAGGGGACGTCCCCAGCTCCCGTGCCTTATGGTTACTTTGGAGGCGGGTACTACTC 
CTGCCGAGTGTCCCGGAGCTCGCTGAAACCCTGTGCCCAGGCAGCCACCCTGGCCGCGTA 
CCCCGCGGAGACTCCCACGGCCGGGGAAGAGTACCCCAGCCGCCCCACTGAGTTTGCCTT 
CTATCCGGGATATCCGGGAACCTACCAGCCTATGGCCAGTTACCTGGACGTGTCTGTGGT 
GCAGACTCTGGGTGCTCCTGNAGAACCGCGACATGACTCCCTGTTGCCTGTGGACAGTTA 
CCAGTCTTGGGCTCTCGCTGGTGGCCTGGAACAGCCCAGATGTGTTTGCCCAGGGNAGAA 
CACGAACCCCACCCGGTTCCCCCTTTTGGGAAAGGGCAGCCATTTTGGCCAGCCTTCCAA 
GCGGGGCCAACCACCCCCTCCCCTGGACAGGCCCTGGT 

AA807966 

GCGGCCGCAAGAAACGCATTCCGTACAGCAAGGGGCAGTTGCGGGACTGGAGCGGGAGTA 
TGCGGCTAACAAGTTCATCACCAAGGACAAGAGGCGCAAGATCTCGGCAGCCACCAGCCT 
CTCGGAGCGCCAGATTACCATCTGGTTTCAGAACCGCCGGGTCAAAGAGAAGAAGGTTCT 
CGCCAAGGTGAAGAACAGCGCTACCCCTTAAGAGATCTCCTTGCCTGGGTGGGAGGAGCG 
AAAGTGGGGGTGTCCTGGGGAGACCAGGAACCTGCCAAGCCCAGGCTGGGGCCAAGGACT 
CTGCTGAGAGGCCCCTAGAGACAACACCCTTCCCAGGCCACTGGCTGCTGGACTGTTCCT 
CAGGAGCGGCCTGGGTACCCAGTATGTGCAGGGAGACGGAACCCCATGTGACAGCCCATT 
CCACCAGGGTTCCCAAAGAACCTGGCCCAGTCATAATCATTCATCCTGACAGTGGC 

AI884491 

AGCGGCCGCAAGAAACGCATTCCGTACAGCAAGGGGCAGTTGCGGGAGCTGGAGCGGGAG 
TATGCGGCTAACAAGTTCATCACCAAGGACAAGAGGCGCAAGATCTCGGCAGCCACCAGC 
CTCTCGGAGCGCCAGATTACCATCTGGTTTCAGAACCGCCGGGTCAAAGAGAAGAAGGTT 
CTCGCCAAGGTGAAGAACAGCGCTACCCCTTAAGAGATCTCCTTGCCTGGGTGGGAGGAG
CGAAAGTGGGGGTGTCCTGGGGAGACCAGGAACCTGCCAAGCCCAGGCTGGGGCCAAGGA 
CTCTGCTGAGAGGCCCCTAGAGACAACACCCTTCCCAGGCCACTGGCTGCTGGACTGTTC 
CTCAGGAGCGGCCTGGGTACCCAGTATGTGCAGGGAGACGGAACCCCATGTGACAGCCCA 
CTCCACCAGGGTTCCCAAAGAACCTGGCCCAGTCATAATCATTCATCCTGACAGTGGCAA 
TAATCACGATAACCAGTACTAGCTGCCATGATCGTTAGCCTCATATTTTCTATCTAGAGC
TCTGTAGAGCAC 
AA652388

GCGGCCGCAAGAAACGCATTCCGTACAGCAAGGGGCAGTTGCGGGACTGGAGCGTGAGTA 
TGCGGCTAACAAGTTCATCACCAAGGACAAGAGGCGCAAGATCTCGGCAGCCACCAGCCT 
CTCGGAGCGCCAGATTACCATCTGGTTTCAGAACCGCCGGGTCAAAGAGAAGAAGGTTCT
CGCCAAGGTGAAGAACAGCGCTACCCCTTAAGAGATCTCCTTGCCTGGGTGGGAGGAGCG 
AAAGTGGGGGTGTCCTGGGGAGACCAGGAACCTGCCAAGCCCAGGCTGGGGCCAAGGACT 
CTGCTGAGAGGCCCCTAGAGACAACACCCTTCCCAGGCCACTGGCTGCTGGACTGTTCCT 
CAGGAGCGGCCTGGGTACCCAGTATGTGCAGGGAGACGGAACCCCATGTGACAGCCCACT 
CCACCAGGGTTCCCAAAGAACCTGGCC

BF446158 

TTTTTTTTTTTTTTTTTTTCGCCTTCCAAATTTATTCATAATTAGCTCAATTGATGAAAG 
CGGTTTCTAAAGTGCTCTACAAAGCTCTAAATAAAAAATATGAGGCTAACGATCATGGCA 
GCTAGTACTGGTTATCGGGATTATTGCCACTGTCAGGATGAATGATTATGACTGGGCCAG
GTTCTTTGGGAACCCTGGTGGAGTGGGCTGTCACATGGGGTTCCGTCTCCCTGCACATAC 
TGGGTACCCAGGCCGTTCCTGAGGAACAGTCCACCACCCAGTGGCCTGGGAAGGGTGTTG 
TCTCTAGGGGCCTCTCAACAAAGTCCTTGGCCCCAGCCTGGGCTTGGCAGGTTCCTGGTC 
TCCCCAGGACACCCCCACTTTCGCTCCTCCCACCCAGGCAAGGAGATCTCTTAAGGGG 

AA657924 

GACGCNAGGTATGCGGCTAACAAGTTCATCACCAAGGACAAGAGGCGCAAGATCTCGGCA 
GCCACCAGCCTCTCGGAGCGCCAGATTACCATCTGGTTTCAGAACCGCCGGGTCAAAGAG 
AAGAAGGTTCTCGCCAAGGTGAAGAACAGCGCTACCCCTTAAGAGATCTCCTTGCCTGGG 
TGGGAGGAGCGAAAGTGGGGGTGTCCTGGGGAGACCAGGAACCTGCCAAGCCCAGGCTGG
GGCCAAGGACTCTGCTGAGAGGCCCCTAGAGACAACACCCTTCCCAGGCCACTGGCTGCT 
GGACTGTTCCTCAGGAGCGGCCTGGGTACCCATGTATGTGCAGGGAGACGGAACCCCATG 
TGACAGCCCACTCCACCAGNGTTCCTAAAGAACCCTGGCCAGTCA 

AA644637

GCAGGCGACTTGCGAGCTGGGAGCGGTTTAAAACGCTTTGGATTCCCCCGGCCTGGGTGG 
GGAGAGCGAGCTGGGTGCCCCCTAGATTCCCCGCCCCCGCACCTCATGAGCCGACCCTCG 
GTCCATGGACACGGCAATTATGCCACCTTGGATGGAGCCAAGGATATCGAAGGCTTGCTG 
GGAGCGGGAGGGGGGCGGAATCTGGTCGCCCACTCCCCTCTGACCAGCCACCCAGCGGCG 
CCTACGCTGATGCCTGCTGTCAACTATGCCCCCTTGGATCTGCCAGGCTCGGCGGACTCT
NAAAGCATATGCCACCCNATGCCCTGGGGTGCCCCAGGGGAACGTCCCCAGCTCCCGTGC 
CTTATGGTT 

BF222357 
GCGGCCGCAAGAAACGCATTCCGTACAGCAAGGGGCAGTTGCGGGAGCTGGAGCGGGAGT
ATGCGGCTAACAAGTTCATCACCAAGGACAAGAGGCGCAAGATCTCGGCAGCCACCAGCC 
TCTCGGAGCGCCAGATTACCATCTGGTTTCAGAACCGCCGGGTCAAAGAGAAGAAGGTTC 
TCGCCAAGGTGAAGAACAGCGCTACCCCTTAAGAGATCTCCTTGCCTGGGTGGGAGGAGC 
GAAAGTGGGGGTGTCCTGGGGAGACCAGGAACCTGCCAAGCCCAGGCTGGGGCCAAGGAC
TCTGCTGAGAGGCCCCTAGAGACAACACCCTTCCCAGGCCACTGGCTGCTGGACTGTTCC 
TCAGGAGCGGCCTG 

AA527613 

GTCGACGAACAGCGCTACCCCTTAAGAGATCTCCTTGCCTGGGTGGGAGGAGCGAAAGTG
GGGGTGTCCTGGGGAGACCGGGAACTGCCAAGCCCAGGCTGGGGCAAGGACTCTGCTGAG 
AGGCCCCTAGAGACAACACCCTTCCCAGGCCACTGCTGCTGGACTGTTCCTCAGGAGCGG 
CCTGGGTACCCAGTATGTGCAGGGAGACGGAACCCCATGTGACAGCCCACTCCACCAGGG 
TTCCCAAAGAACCTGGCCCAGTCATAATCATTCATCCTGACAGTGGCAATAATCACGATA 

ACCAGTACTCAGCTGCCATGATCGTTAGCCTCATATT

AA533227 

GCGTCGACCCCTTGAAGAGATCTCCTTGCCTGGGTGGGAGGAGCGAAAGTGGGGGTGTCC 
TGGGGAGACCAGGAACCTGCCAAGCCCAGGCTGGGGCCAAGGACTCTGCTGAGAGGCCCC 

TAGAGACAACACCCTTCCCAGGCCACTGGCTGCTGGACTGTTCCTCAGGAGCGGCCTGGG
TACCCAGTATGTGCAGGGAGACGGAACCCCATGTGACAGCCCACTCCACCAGGGTTCCCA 
AAGAACCTGGCCCAGTCATAATCATTCATCCTGACAGTGGCAATAATCACGATAACCAGT 
ACTAGCTGCCATGATCGTTAGCCTCATATTTTCTATCTAGAGCTCTGTAGAGCACTTGTA 
GAAACCGCTTTCATGAATTGAGCTAATTATGAATAGATTTGGAAGGGGAAAAAAGTGGAA 

AAAGTTTTGCCCAAAGTGGGTCGTTTACGTCG

AA456069 

CTCCCTGGCAACACATCTGGCTGTTCCAGCACCAGCGAGACCCAAGACTGGTAACTGTCC 
ACAGGCAACAGGGAGTCATGTCGCGGTTCTCCAGGAGCACCCAGAGTCTGCACCACAGAC 
ACGTCCAGGTAACTGGCCATAGCTGAGTAGGTTCCCGGATATCCCGGATAGAAGGCAAAC
TCAGTGGGGCGGCTGGGGTACTCTTCCCCGGCCGTGGAGAGTCTCCGCGGGGTACGGCCC 
AGGGTGGCTGCCTGGGCATCAGGGTTTCAGCGAGCTCCGGGACACTCGGCAGGAGTAGTA 
CCCGCCTCCAAAGTAACCATAAGGCACGGGAGCTGGGGACGTCCCTGGGGCACCCCAG 

AA455572

TTTAAAACGCTTTGGATTCCCCCGGCCTGGGTGGGGAGAGCGAGCTGGGTGCCCCCTAGA 
TTCCCCGCCCCCGCACCTCATGAGCCGACCCTCGGTCCATGGAGCCGGCGAATTATGCCA 
CCTTGGATGGAGCCAAGGATATCGAAGGCTTGCTGGGAGCGGGAGGGGGGCGGAATCTGG 
TCGCCCACTCCCCTCTGACCAGCCACCCAGCGGCGCTACGTGATGCCTGCTGTCAACTAT 
GCCCTTGGATCTGCCAGCTCGCGGAGCCAAAGCAATGCCACCCATGCCCTGGGGTGCCCC
AGGTGACGTCCCCAGCTCCCGTGCCTTATGGTTACTTTGGAGGCGGGTACTACTCCTGCC
GAGTGTCCCGGAGCTCGCTGAAACCCTGTGCCCAGGCAGCCACCCTGGCCGCGTACCCCG 
CGATGACTCCCACGGCCGGGGAAGAGTACCCCAGCCGCCCCACTGAGTTTGCCT 

BX117624

CAGGCGACTTGCGAGTCTGGGAGCGATTTAAAACGCTTTGGATTCCCCCGGCCTGGGTGG 
GGAGAGCGAGCTGGGTGCCCCCTAGATTCCCCGCCCCCGCACCTCATGAGCCGACCCTCG 
GCTCCATGGAGCCCGGCAATTATGCCACCTTGGATGGAGCCAAGGATATCGAAGGCTTGC 
TGGGAGCGGGAGGGGGGCGGAATCTGGTCGCCCACTCCCCTCTGACCAGCCACCCAGCGG 

CGCCTACGCTGATGCCTGCTGTCAACTATGCCCCCTTGGATCTGCCAGGCTCGGCGGAGC
CGCCAAAGCAATGCCACCCATGCCCTGGGGTGCCCCAGGGGACGTCCCCAGCTCCCGTGC 
CTTATGGTTACTTTGGAGGCGGGTACTACTCCTGCCGAGTGTCCCGGAGCTCGCTGAAAC 
CCTGTGCCCAGGCAGCCACCCTGGCCGCGTACCCCGCGGAGACTCCCACGGCCGGGGAAG 
AGTACCCCAGCCGCCCCACTGAGTTTGCCTTCTATCCGGGATATCCGGGAACCTACCAGC 

CTATGGCCAGTTACCTTGGACGTGTCTGTGGTGCAGACTCTGGGTGCTCCTGGAGAACCG
CGACATGACTCCCTGNTGCCTGTGGACAGTTACCAGTCTTGGGCTCTCGCTGGTGGCTGG 
AACAGCCAGATGTGTTGNCAGGGAGAACAGAACCCACCAGGTCCCTTTTGGAAGGCAGAT 
TTGCAGACTNCAGCGGGCA 

BQ673782

AGGCAGCCACCCTGGCCGCGTACCCCGCGGAGACTCCCACGGCCGGGGAAGAGTACCCCA 
GCCGCCCCACTGAGTTTGCCTTCTATCCGGGATATCCGGGAACCTACCAGCCTATGGCCA 
GTTACCTGGACGTGTCTGTGGTGCAGACTCTGGGTGCTCCTGGAGAACCGCGACATGACT 
CCCTGTTGCCTGTGGACAGTTACCAGTCTTGGGCTCTCGCTGGTGGCTGGAACAGCCAGA 

TGTGTTGCCAGGGAGAACAGAACCCACCAGGTCCCTTTTGGAAGGCAGCATTTGCAGACT
CCAGCGGGCAGCACCCTCCTGACGCCTGCGCCTTTCGTCGCGGCCGCAAGAAACGCATTC 
CGTACAGCAAGGGGCAGTTGCGGGAGCTGGAGCGGGAGTATGCGGCTAACAAGTTCATCA 
CCAAGGACAAGAGGCGCAAGATCTCGGCAGCCACCAGCCTCTCGGAGCGCCAGATTACCA 
TCTGGTTTCAGAACCGCCGGGTCAAAGAGAAGAAGGTTCTCGCCAAGGTGAAGAACAGCG 

CTACCCCTTAAGAGATCTCCTTGCCTGGGTGGGAGGAGCGAAAGTGGGGGTGTCCTGGGG
AGACCAGGAACCTGCCAAGCCCCAGGCTGGGGCCAAGGACTCTGCTGAGAGGCCCCTAGA 
GACAACACCCTTCCCAGGCCACTGGCTGCTGGACTGTTCCTCAGGAGCGGCCTGAGTACC 
CCGTATGTGCAGGGGAGACGGAACCCCCTGTGACCAGCCCCCCTCCACCCGTGGTCTCCC 
AGATAACCTGGCCCCCACTCATAAATCATTTCTTCCCGGGCCGGGGGCCAATCATTCCCC 

GAACTACCCCGGTACCTTATACAATTAGATTGGACATGAATCCTCTCGGGGGCATTCCCT
ATGGCGCTGAGGCCCCTCACACCT 

AI814453

GGGTGCTGTCCTCTGGAGTCTGCAAATGCTGCCTTCCAAAAGGGACCTGGTGGGTTCTGT 
40 TCTCCCTGGCAACACATCTGGCTGTTCCAGCCACCAGCGAGAGCCCAAGACTGGTAACTG 
TCCACAGGCAACAGGGAGTCATGTCGCGGTTCTCCAGGAGCACCCAGAGTCTGCACCACA 
GACACGTCCAGGTAACTGGCCATAGGCTGGTAGGTTCCCGGATATCCCGGATAGAAGGCA
AACTCAATGGGGCGGCTGGGGTACTCTTCCCCGGCCGTGGGAGTCTCCGCGGGGTACGCG 
GCCAGGGTGGCTGCCTGGGCACAGGGTTTCAGCGAGCTCCGGGACACTCGGCAGGAGTAG 
TACCCGCCTCCAAAGTAACCATAAGGCACGGGAGCTGGGGACGTCCCCTGGGGCACCCCA 
NGGCATGGGTGGCATTGCTTTGGCGGCTCCGCCGAGCCTGGCAGATCCAAGGGGGCATAG
TTGACAGCAGGCATCAGCGTAGGCGCCGCTGGGTGGCTGGTCAAAAGGGAGTGGCGACCA 
NATTCCGCCCCCCTCCCGCTTCCCAG 

AI417272

GGGTGCTGCCCGCTGGAGTCTGCAAATGCTGCCTTCCAAAAGGGACCTGGTGGGTTCTGT
TCTCCCTGGCAACACATCTGGCTGTTCCAGCCACCAGCGAGAGCCCAGGACTGGTAACTG 
TCCACAGGCAACAGGGAGTCATGTCGCGGTTCTCCAGGAGCACCCAGAGTCTGCACCACA 
GACACGTCCAGGTAACTGGCCATAGGCTGGTAGGTTCCCGGATATCCCGGATAGAAGGCA 
AACTCAGTGGGGCGGCTGGGGTACTCTTCCCCGCCGTGGGAGTCTCCGCGGGGTACGCGG 

CCAGGGTGGCTGCCTGGGCACAGGGTTTCAGCGAGCTCCGGGACACTCGGCAGGAGTAGT
ACCCGCCTCCAAAGTAACCATAAGGCACGGGAGCTGGGGACGTCCCCTGGGGCACCCCAG 
GGCATGGGTGGCATTGCTTTGGCGGCTCCGCCGAGCCTGGCAGATCCAAGGNGGCATAGT 
TGACAGCAGGCATCAGCGTANGCGCCGCTGGGTGGCTGTCAAGAGG 

AA535663

TCGACGTTACCTGGACGTGTCTGTGGTGCAGACTCTGGGTGCTCCTGGAGAACCGCGACA 
TGACTCCCTGTTGCCTGTGGACAGTTACCAGTCTTGGGCTCTCGCTGGTGGCTGGAACAG 
CAGATGTGTTGCCAGGGAGAACAGAACCCACCAGGTCCCTTTTGGAAGGCAGCATTTGCA 
GACTCCAGCGGGCAGCACCCTCCTGACGCCTGCGCCTTTCGTCGCGGCCGCAAGAAACGC 
ATTCCGTACAGCAAGGGGCAGTTGCGGGACTGGAGCGGGAGTATGCGGCTAACAAGTTCA
TCACCAAGGACAAGAGGCGCAAGATCTCGGCAGCCACCAGCCTCTCGGAGCGCCAGATTA 
CCATCTGGTTTCAGAACCGCCGGGTCAAAGAGAAGAAGGTTCTCGCCAAGGTGAAGAACA 
GCGCTACCCCTTAAGAGATCTCCTTGCCTGGGTGGGAGGAGCGAAAGTGTG 

AI400493

GTCAGGAGGGTGCTGCCCGCTGGAGTCTGCAAATGCTGCCTTCCAAAAGGGACCTGGTGG 
GTTCTGTTCTCCCTGGCAACACATCTGGCTGTTCCAGCCACCAGCGAGAGCCCAGGACTG 
GTAACTGTCCACAGGCAACAGGGAGTCATGTCGCGGTTCTCCAGGAGCACCCAGAGTCTG 
CACCACAGACACGTCCAGGTAACTGGCCATAGGCTGGTAGGTTCCCGGATATCCCGGATA 

GAAGGCAAACTCAGTGGGGCGGCTGGGGTACTCTTCCCCGGCCGTGGGAGTCTCCGCGGG
GTACGCGGCCAGGGTGGCTGCCTGGGCACAGGGTTTCAGCGAGCTCCGGGACACTCGGCA 
TGAGTAGACCCGCCTTCCAAGTAACCATAAGGCACGGGAGCTGGTAACGTCCCCTGGGGC 
ACCCCANGGCCATGGGTGCATTGCTTTGGCGGCTCCGCCGAGCCCTGCAGATCCAAGGTG 
GGCATATTGACAGCAGGCATTCACGTATGCGCCCCCTGGGTGGCTGTCATATTGGGGATT 

GCGAC
AW779219

GCAGGCGTCAGGAGGGTGCTGCCCGCTGGAGTCTGCAAATGCTGCCTTCCAAAAGGGACC 
TGGTGGGTTCTGTTCTCCCTGGCAACACATCTGGCTGTTCCAGCCACCAGCGAGAGCCCA 
AGACTGGTAACTGTCCACAGGCAACAGGGAGTCATGTCGCGGTTCTCCAGGAGCACCCAG 
AGTCTGCACCACAGACACGTCCAGGTAACTGGCCATAGGCTGGTAGGTTCCCGGATATCC
CGGATAGAAGGCAAACTCAGTGGGGCGACTGGGGTACTCTTCCCGGCCGTGGGGAGTCTC 
CGCGGGGTACGCGGCCAGGGGTGGCTGCCTGGGCACCAGGGGTTTCAGCGAGCTCCGGGA 
CACTCNGCAGGAAANTAGTACCCGCCTCCCAAAGTAACCATAAGCACCGGACTGNGGGNN 
GGACGTCCCCTGGGGCAC 

AA594847 

GCGACCGGACGAAAGGAGGCGTCAGGAGGGTGCTGCCCGCTGGAGTCTGCAAATGCTGCC 
TTCCAAAAGGGACCTGGTGGGTTCTGTTCTCCCTGGCAACACATCTGGCTGTTCCAGCAC 
CAGCGAGACCCAAGACTGGTAACTGTCCACAGGCAACAGGGAGTCATGTCGCGGTTCTCC 
AGGAGCACCCAGAGTCTGCACCACAGACACGTCCAGGTAACTGGCCATAGCTAGGTAGGT
TCCCGGATATCCCGGATAGAAGGCAAACTCAGTGGGGCGACTGGGGTACTCTTCCCCGGC 
CGTGGGAGTCTCCGCGGGGTACGCCCATGGGTGGCTGCCTGGGCACAGGGTTTCAGCGAG 
CTCCGGGACA 

AI150430

GCAGGCGTCAGGAGGGTGCTGCCCGCTGGAGTCTGCAAATGCTGCCTTCCAAAAGGGACC 
TGGTGGGTTCTGTTCTCCCTGGCAACACATCTGGCTGTTCCAGCCACCAGCGAGAGCCCA 
AGACTGGTAACTGTCCACAGGCAACAGGGAGTCATGTCGCGGTTCTCCAGGAGCACCCAG 
AGTCTGCACCACAGACACGTCCAGGTAACTGGCCATAGGCTGGTAGGTTCCCGGATATCC 
CGGATAGAAGGCAAACTCAGTGGGGCGACTGGGGTACTCTTCCCCGGCCGTGGGAGTCTC
CGCGGGGTACGCGGCCAGGGTGGCTGCCTGGGCACAGGGTTTCAGCGAGCTCCGGGACAC 
TCGGCAGGAGTAGTACCCGCCTCCAAAGTAACCATAAGGCACGGGAGCTGGATGCGTCCC 
CTAGGGCACCCCATGGCATGGGTGGCATTGCTTTGGCGGCTCCGCCGAGCCTGGCAGATC 
CAAGGAGGCACTGTT 

AA494387 

GGGTGCTGCCCGCTGGAGTCTGCAAATGCTGCCTTCCAAAAGGGACCTGGTGGGTTCTGT 
TCTCCCTGGCAACACATCTGGCTGTTCCAGCCACCAGCGAGACCCAAGACTGGTAACTGT 
CCACAGGCAACAGGGAGTCATGTCGCGGTTCTCCAGGAGCACCCAGAGTCTGCACCACAG 
ACACGTCCAGGTAACTGGCCATAGGCTGGTAGGTTCCCGGATATCCCGGATAGAAGGCAA
ACTCAGTGGGGCGGCTGGGGTACTCTTCCCCGGCCGTGGGAGTCTCCGCGGGGTACGCGT 
CCAGGGTGGCTGCCTGGGCACAGGGTTTCAGCGAGCTCCGGGACACTCGGCAGGAGTAGT 
ACCCGCCTCCAAAGTAACCATAAGGCACGGGAGCTGGGGACGTCCCTG 

AA662643

GGGTGCTGCCCGCTGGAGTCTGCAAATGCTGCCTTCCAAAAGGGACCTGGTGGGTTCTGT
TCTCCCTGGCAACACATCTGGCTGTTCCAGCCACCAGCGAGACCCAAGACTGGTAACTGT 
CCACAGGCAACAGGGAGTCATGTCGCGGTTCTCCAGGAGCACCCAGAGTCTGCACCACAG 
ACACGTCCAGGTAACTGGCCATAGGTGGTAGGTTCCCGGATATCCCGGATAGAAGGCAAA 
CTCAGTGGGGCGGCTGGGGTACTCTTCCCCGGCCGTGGGAGTCTCCGCGGGGTACGCGGC
CAGGGTGGCTGCCTGGGCACAGGGTTTCAGCGAGCTCCGGGACA 

AI935940 

GGGTGCTGCCCGCTGGAGTCTGCAAATGCTGCCTTCCAAAAGGGACCTGGTGGGTTCTGT 
TCTCCCTGGCAACACATCTGGCTGTTCCTGCCACCAGCGAGAGCCCAAGACTGGTAACTG
TCCACAGGCAACAGGGAGTCATGTCGCGGTTCTCCAGGAGCACCCAGAGTCTGCACCACA 
GACACGTCCAGGTAACTGGCCATAGGCTGGTAGGTTCCCGGATATCCCGGATAGAAGGCA 
AACTCAGTGGGGCGGCTGGGGTACTCTTCCCCGGCCGTGGGAGTCTCCGCGGGGTACGCG 
GCCAGGGTGGCTGCCTGGGCACAGGGTTTCAGCG 

AA532530 

GGGTGCTGCCCGCTGGAGTCTGCAAATGCTGCCTTCCAAAAGGGACCTGGTGGGTTCTGT 
TCTCCCTGGCAACACATCTGGCTGTTCCAGCCACCAGCGAGACCCAAGACTGGTAACTGT 
CCACAGGCAACAGGGAGTCATGTCGCGGTTCTCCAGGAGCACCCAGAGTCTGCACCACAG 
ACACGTCCAGGTAACTGGCCATAGGTNGGTAGGTTCCCGGATATCCCGGATAGAAGGCAA
ACTCAGTGGGGCGGCTGGGGTACTCTTCCCCGGCCGTGGGAGTCTCCG

AA857572 

CTCCCTGGCAACACATCTGGCTGTTCCAGCACCAGCGAGAGCCAAGACTGGTAACTGTCC 
ACAGGCAACAGGGAGTCATGTCGCGGTTCTCCAGGAGCACCCAGAGTCTGCACCACAGAC
ACGTCCAGGTAACTGGCCATAGGTCGGTAGGTTCCCGGATATCCCGGATAGAAGGCAAAC 
TCAGTGGGGCGACTGGGGTACTCTTCCCCGGCCGTGGGAGTCTCCGCGGGGTACGGCNAC 
AGGGTGGCTGCCTGGGCACAGGGTTTCAGCGAGCTCCGGGACACTCGGCAGGAGTAGTAN 
CCGCCTCAAAGTAACCATAANGCACGGGAGCTGGGGACGTCCC 

AI261980 

ACGAAAGGCGCAGGCGTCAGGAGGGTGCTGCCCGCTGGAGTCTGCAAATGCTGCCTTCCA 
AAAGGGACCTGGTGGGTTCTGTTCTCCCTGGCAACACATCTGGCTGTTCCAGCCACCAGC 
GAGAGCCCAAGACTGGTAACTGTCCACAGGCAACAGGGAGTCATGTCGCGGTTCTCCAGG 
AGCACCCAGAGTCTGCACCACAGACACGTCCAGGTAACTGGCCATAGGCTGGTAGGTTCC
CGGATATCCCGGATAGAAGGCAAACTCAGTGGGGCGACTGGGGTACTCTTCCCCGGCCCG 
GGGAGTCTCCGCGGGGTACGCGGCCAGGGTGGCTGCCTGGGCACAGGGTTTCAGCGAGCT 
CCGGGACACTCGGCGGAGNTAGTACCCGCCTCCAAAGTAACCATAAGGCACGGGAGCTGG 
GGAACCGTCCCCTGGGGCACC 

BE888751.1

GAGCGAGCTGGGTGCCCCCTAGATTCCCCGCCCCCGCACCTCATGAGCCGACCCTCGGCT 
CCATGGAGCCCGGCAATTATGCCACCTTGGATGGAGCCAAGGATATCGAAGGCTTGCTGG 
GAGCGGGAGGGGGGCGGAATCTGGTCGCCCACTCCCCTCTGACCAGCCACCCAGCGGCGC 
CTACGCTGATGCCTGCTGTCAACTATGCCCCCTTGGATCTGCCAGGCTCGGCGGAGCCGC
CAAAGCAATGCCACCCATGCCCTGGGGTGCCCCAGGGACGTCCCCAGCTCCCGTGCCTTA 
TGGTTACTTTGGAGGCGGGTACTACTCCTGCCGAGTGTCCCGGAGCTCGCTGAAACCCTG 
TGCCCAGGCAGCCACCCTGGCCGCGTACCCCGCGGAGACTCCCACGGCCGGGGAAGAGTA 
CCCCAGCCGCCCCACTGAGTTTGCCTTCTATCCGGGATATCCGGGAACCTACCAGCCTAT 
GGCCAGTTACCTGGACGTGTCTGTGGTGCAGACTCTGGGTGCTCCTGGAGAACCGCGACA
TGACTCCCTGTTGCCTGTGGACAGTTACCAGTCTTGGGCTCTCGCTGGTGGCTGGAACAG 
CCAGATGTGTTGCCAGGGAGAACAGAACCCACCAGGTCCCTTTTTGGAAGGCAGCATTTG 
CAGACTCCAGCGGCAGGACCTCCTGAACGCCTGCGCCTTTCGTCGCGGCGTCTAAAGTAA 
TCCTCGAGG 

AI378797 

GCGGCCGCGGCCCACCACCAACTGCTCGCCACCGACCCCACTACTCGCCACCGACCCGCT 
GCTCGGAGCTTCGGTTCTGCGGGTTGTCCAGACTTCAGGCCTGTGCGCTCAATCGTGGAG 
AATGCGCCGGCAGGCCCCCCACCCCCAGCCTAAGGTGCAGGAAGGACCAGCACGAACCCG
CTGGCTTTGCTGCGCGGCCAGGAGATGAGTCCCACCGGGCACTGAGCCCAGGTACAGGAC
ATCAGAGAATGAACACAGAGGCAGAGGCCCTCATGTCCCTCTCAGAGTCCCGGCTCTGCA 
NAGAGCCCGTCTGTCTCCAGCTTCCAGAATTCCGCACTGTGAATCTGTCTACGTGGACTG 
GGAAAACAGGGTTGGCACCACTCTGCCACTCCGTTTGTGCCTGGGAAGGGCTAAGTATGC 
AAGGCTACAAACATCTACTTCACTGGGATCCCAAATGCTCAACAAACCATGACCTGCTNT
GGTCAGAACCACCAGAAATATT



AA234220 

GCAGGCGACTTGCGAGCTGGGAGCACTTTAAAACGCTTTGGATTCCCCCGGCCTGGGTGG 
GGAGAGCGAGCTGGGTGCCCCCTAGATTCCCCGCCCCCGCACCTCATGAGCCGACCCTCG 
GCTCCATGGAGCCTGGCATATTATGCCACCTTGGTATGGAGCCAAGGATATCGAAGGCTT
GCTGGGAGCGGGAGGGGGGCGGAATCTGGTCGCCCACTCCCCTCTGACCAGCCACCCAGC 
GGCGCCTACGCTGATGCCTGCTGTCAACTATGCCCCCTTGGA 



AA236353 

GCCCGCTGGAGTCTGCAAATGCTGCCTTCCAAAAGGGACCTGGTGGGTTCTGTTCTCCCT
GGCAACACATCTGGCTGTTCCAGCCACCAGCGAGACGCCAAGACTGGTAACTGTCCACAG 
GCAACAGGGAGTCATGTCGCGGTTCTCCAGGAGCACCCAGAGTCTGCACCACAGACACGT 
CCAGGTAACTGGCCATAGGTNGGTAGGTTCCCGGATATCCCGGATAGAAGGCAAACTCAG 
TGGGGCGGCTGGGGTACTCTTCCCCGGCCGTGGGAGTCTCCGCGGGGTACGCGCACAGGG
TGGCTGCCTGGGCACAGGGTTTCAGCGAGCTCCGGGACACTCGGCAGGAGTAGTACCCGC
CTCCAAAGTAACCATAAGGCA 




PATENT 

Atty. Dkt. No. 022041001420 



AA588193 

AACTGCTCGCCACCGACCCCACTACTCGCCACCGACCCGCTGCTCGGAGCTTCGGTTCTG 
CGGGTTGTCCAGACTTCAGGCCTGTGCGCTCAATCGTGGAGAATGCGCCGGCAGCCCCCA 
CCCCCAGCCTAAGGTGCAGGAAGGACCAGCACGAACCCGCTGGCTTTGCTGCGCGGCCAG
GAGATGAGTCCCACCGGGCACTGAGCCCAGGTACAGGACATCAGAGAATGAACACAGAGG 
CAGAGGCCCTCATGTCCCTCTCAGAGTCCCGGCTCTGCAAAGAGCCCGTCTGTCTCCAGC 
TTCCAGAATTCCGCACTGTGAATCTGTCTACGTGGACTGGGAAAACAGGGTTGGCACCAC 
TCTGCCACTCCGTTTGTGCCTGGGAAGGGCTAAGTATGCAAGGCT 

AI821103 

GATCCCTTTGCAGGGAAGCTTTCTCTCAGACCCCCTTCCATTACACCTCTCACCCTGGTA 
ACAGCAGGAAGACTGAGGAGAGGGGAACGGGCAGATTCGTTGTGTGGCTGTGATGTCCGT 
TTAGCATTTTTCTCAGCTGACAGCTGGGTAGGTGGACAATTGTAGAGGCTGTCTCTTCCT 
CCCTCCTTGTCCACCCCATAGGGTGTACCCACTGGTCTTGGAAGCACCCATCCTTAATAC
GATGATTTTTCTGTCGTGTGAAAATGAAGCCAGCAGGCTGCCCCTAGTCAGTCCTTCCTT 
CCAGAGAAAAAGAGATTTGAGAAAGTGA 

AI821851 

TTTTTTTTTTTTTTTTTTTTCTTTTTCACTTTCTCAAATCTCTTTTTCTCTGGAAGGAAG
GACTGACTAGGGGCAGCCTGCTGGCTTCATTTTCACACGACAAAAAAATCATCGTATTAA 
GGATGGGTGCTTCCAAAACCAGTGGGTACACCCTATGGGGGGGACAAGGAGGGAGGAAGA 
GACAGCCTCTACAATTGTCCACCTACCCAGCTGTCAGCTGAGAAAAATGCTAAACGGACA 
TCACAGCCACACAACGAATCTGCCCGTTCCCCTCTCCTCAGTCTTCCTGCTGTTACCAGG
GTGAGAGGTGTAATGGAAGG

AA635855 

TTTTTTTTTTTTTTTTTTTTCTTTTTCACTTTCCCAAATCTCTTTTTCTCTGGAAGGAAG 
GACTGACTAGGGGCAGCCTGCTGGCTTCATTTTCACACGACAGAAAAATCATCGTATTAA 
GGATGGGTGCTTCCAAGACCAGTGGGTACACCCTATGGGGTGGACACAGGAGGGAGGAAG
AGACAGCCTCTACAATTGTCCACCTACCCAGCTGTCAGCTGAGAAAAATGCTAAACGGAC 
ATCACAGCCACACAACGAATCTGCCCGTTCCCCTCTCCTCAGTCTTCCTGCTGTTACCAG 
GGTGAGAGGTGTAATGGAAGG 

AI420753

GCGGCCGCGGCCCACCACCAACTGCTCGCCACCGACCCCACTACTCGCCACCGACCCGCT 
GCTCGGAGCTTCGGTTCTGCGGGTTGTCCAGACTTCAGGCCTGTGCGCTCAATCTTGGAG 
AATGCGCCGGCAGGCCCCCCACCCCCAGCCTAAGGTGCAGGAAGGACCAGCACGAACCCG 
CTGGCTTTGCTGCGCGGCCAGGAGATGAGTCCCACCGGGCACTGAGCCCAGGTACAGGAC
ATCAGAGAATGAACACAGAGGCAGAGGCCCTCATGTCCCTCTCAGAGTCCCGGCTCTGCA 
AAGAGCCCGTCTGTCTCCAGCTTCCAGAATTCCGCACTGTGAATCTGTCTACGT 

BG180547

CACGCGTCGATCCCAGTGAAGTAGATGTTTGTAGCCTTGCATACTTAGTCCTTCCCAGGC
ACAAACGGAGTGGCAGAGTGGTGCCAACCCTGTTTTCCCAGTCCACGTAGACAGATTCAC 
AGTGCGGAATTCTGGAAGCTGGAGACAGACGGGCTCTTTGCAGAGCCGGGACTCTGAGAG 
GGACATGAGGGCCTCTGCCTCTGTGTTCATTCTCTGATGTCCTGTACCTGGGCTCAGTGC 
CCGGTGGGACTCATCTCCTGGCCGCGCAGCAAAGCCAGCGGGTTCGTGCTGGTCCTTCCT 
GCACCTTAGGCTGGGGGTGGGGGGCCTGCCGGCGCATTCTCCACGATTGAGCGCACAGGC
CTGAAGTCTGGACAACCCGCAGAACCGAAGCTCCGAGCAGCGGGTCGGTGGCGAGTAGTG 
GGGTCGGTGGCGAGCAGTTGGTGGTGGG 

AA468306 

TCGACCTCGCCAAGGTGAAGAACAACGCTACCCCTTAAGAGATCTCCTTGCCTGGGTGGG
AGGAGCGAAAGTGGGGGTGTCCTGGGGAGACCAGGAACCTGCCAAGCCCAGGCTGGGGCC 
AAGGACTCTGCTGAGAGGCCCCTAGAGACAACACCCTTCCCAGGCCACTGGCTGCTGGAC 
TGTTCCTCAGGAGCGGCCTGGGTACCCAGTATGTGCAGGGAGA 

AA468232

TTTTTTACTGGTTATCGTGGTTATTGCCACTGTCAGGATGAATGATTATGACTGGGCCAG 
GTTCTTTGGGAACCCTGGTGGAGTGGGCTGTCACATGGGGTTCCGTCTCCCTGCACATAC 
TGGGTACCCAGGCCGCTCCTGAGGAACAGTCCAGCAG 

CB050115

GGCCCACCACCAACTGCTCGCCACCGACCCCACTACTCGCCACCGACCCGCTGCTCGGAG 
CTTCGGTTCTGCGGGTTGTCCAGACTTCAGGCCTGTGCGCTCAATCGTGGAGAATGCGCC 
GGCAGGCCCCCCACCCCCAGCCTAAGGTGCAGGAAGGACCAGCACGAACCCGCTGGCTTT 
GCTGCGCGGCCAGGAGATGAGTCCCACCGGGCACTGAGCCCAGGTACAGGACATCAGAGA 
ATGAACACAGAGGCAGAGGCCCTCATGTCCCTCTCAGAGTCCCGGCTCTGCAAAGAGCCC
GTCTGTCTCCAGCTTCCAGAATTCCGCACTGTGAACCTCGTGCC 

CB050116 

GGCACGAGGTTCACAGTGCGGAATTCTGGAAGCTGGAGACAGACGGGCTCTTTGCAGAGC 
CGGGACTCTGAGAGGGACATGAGGGCCTCTGCCTCTGTGTTCATTCTCTGATGTCCTGTA
CCTGGGCTCAGTGCCCGGTGGGACTCATCTCCTGGCCGCGCAGCAAAGCCAGCGGGTTCG 
TGCTGGTCCTTCCTGCACCTTAGGCTGGGGGTGGGGGGCCTGCCGGCGCATTCTCCACGA 
TTGAGCGCACAGGCCTGAAGTCTGGACAACCCGCAGAACCGAAGCTCCGAGCAGCGGGTC 
GGTGGCGAGTAGTGGGGTCGGTGGCGAGCAGTTGGTGGTGGGCC

AA661819

GCTGCTCGGAGCTTCGGTTCTGCGGGTTGTCCAGACTTCAGGCCTGTGCGCTCAATCGTG 
GAGAATGCGCCGGCAGCCCCCACCCCCAGCCTAAGGTGCAGGAAGGACCAGCACGAACCC 
GCTGGCTTTGCTGCGCGGCCAGGAGATGAGTCCCACCGGCACTGAGCCAGGTACAGGACA
TCAGAGAATGAACACAGAGGCAGAGGCCTCATGTCCCTCTCAGAGTCCCGGCTCTGCAAA 
GAGCCGTACTGTCTCCAGCTTCCAGAATTCCGCACTGTGAATCTGTCTACGTGGACTGGG 
AAAAC 

CF146837

CACGAGGATTTTCTATCTAGAGCTCTGTAGAGCACTTTAGAAACCGCTTTCATGAATTGA 
GCTAATTATGAATAAATTTGGAAGGCGATCCCTTTGCAGGGAAGCTTTCTCTCAGACCCC 
CTTCCATTACACCTCTCACCCTGGTAACAGCAGGAAGACTGAGGAGAGGGGAACGGGCAG 
ATTCGTTGTGTGGCTGTGATGTCCGTTTAGCATTTTTCTCAGCTGACAGCTGGGTAGGTG 
GACAATTGTAGAGGCTGTCTCTTCCTCCCTCCTTGTCCACCCCATAGGGTGTACCCACTG 
GTCTTGGAAACACCCATCCTTAATACGATGATTTTTCTGTCGTGTGAAAATGAAGCCAGC 
AGGCTGCCCCTAGTCAGTCCTTCCTTCCAGAGAAAAAGAGATTTGAGAAAGTGCCTGGGT 
AATTCACCATTAATTTCCTCCCCCAAACTCTCTGAGTCTTCCCTTAATATTTCTGGTGGT 
TCTGACCAAAGCAGGTCATGGTTTGTTGAGCATTTGGGATCCCAGTGAAGTAGATGTTTG 
TAGCCTTGCATACTTAGCCCTTCCCAGGCACAAACGGAGTGGCAGAGTGGTGCCAACCCT 
GTTTTCCCAGTCCACGTAGACAGATTCACAGTGCGGAATTCTGGAAGCTGGAGACAGACG 
GGCTCTTTGCAGAGCCGGGACTCTGAG 

CF146763 

CACGAGGATTTTCTATNCTAGAGCTCTGGTAGAGCACTTTANAAACCGCTTTCATGAATT 
GAGCTAATTATGAATAAATTTGGAAGGCGATCCCTTTGCAGGGAAGCTTTCTCTCAGACC 
CCCTTCCATTACACCTCTCACCCTGGTAACAGCAGGAAGACTGAGGAGAGGGGAACGGGC 
AGATTCGTTGTGTGGCTGTGATGTCCGTTTAGCATTTTTCTCAGCTGACAGCTGGGTAGG 
TGGACAATTGTAGAGGCTGTCTCTTCCTCCCTCCTTGTCCACCCCATAGGGTGTACCCAC 
TGGTCTTGGAAACACCCATCCTTAATACGATGATTTTTCTGTCGTGTGAAAATGAAGCCA 
GCAGGCTGCCCCTAGTCAGTCCTTCCTTCCAGAGAAAAAGAGATTGAGAAAGTGCCTGGG 
TAATTCACCATTAATTTCCTCCCCCAAACTCTCTGAGTCTTCCCTTAATATTTCTGGTGG 
TTCTGACCAAAGCAGGTCATGGTTTGTTGAGCATTTGGGATCCCAGTGAAGTAGATGTTT 
GTAGCCTTGCATACTTAGCCCTTCCCAGGCACAAACGGAGTGGCAGAGTGGTGCCAACCC 
TGTTTTCCCAGTCCACGTAGACAGATTCACAGTGCGGAATTCTGGAAGCTGGAGACAGAC 
GGGCTCTTTGCAGAGCCGGGACTCTGA 

CF144902

CACGAGGGAAGCCAGCAGGCTGCCCCTAGTCAGTCCTTCCTTCCAGAGAAAAAGAGATTT 
GAGAAAGTGCCTGGGTAATTCACCATTAATTTCCTCCCCCAAACTCTCTGAGTCTTCCCT
TAATATTTCTGGTGGTTCTGACCAAAGCAGGTCATGGTTTGTTGAGCATTTGGGATCCCA
GTGAAGTAGATGTTTGTAGCCTTGCATACTTAGCCCTTCCCAGGCACAAACGGAGTGGCA 
GAGTGGTGCCAACCCTGTTTTCCCAGTCCACGTAGACAGATTCACAGTGCGGAATTCTGG 
AAGCTGGAGACAGACGGGCTCTTTGCAGAGCCGGGACTCTGAGAGGGACATGAGGGCCTC 
TGCCTCTGTGTTCATTCTCTGATGTCCTGTACCTGGGCTCAGTGCCCGGTGGGACTCATC
TCCTGGGCGCGCAGCAAAGCCAGCGGGTTCGTGCTGGTCCTTCCTGCACCTTA 

CF141511.1 

CACGAGGCCTGGTAACAGCAGGAAGACTGAGGAGAGGGGAACGGGCAGATTCGTTGTGTG 
GCTGTGATGTCCGTTTAGCATTTTTCTCAGCTGACAGCTGGGTAGGTGGACAATTGTAGA
GGCTGTCTCTTCCTCCCTCCTTGTCCACCCCATAGGGTGTACCCACTGGTCTTGGAAACA 
CCCATCCTTAATACGATGATTTTTCTGTCGTGTGAAAATGAAGCCAGCAGGCTGCCCCTA 
GTCAGTCCTTCCTTCCAGAGAAAAAGAGATTTGAGAAAGTGCCTGGGTAATTCACCATTA 
ATTTCCTCCCCCAAACTCTCTGAGTCTTCCCTTAATATTTCTGGTGGTTCTGACCAAAGC 
AGGTCATGGTTTGTTGAGCATTTGGGATCCCAGTGAAGTAGATGTTTGTAGCCTTGCATA
CTTAGCCCTTCCCAGGCACAAACGGAGTGGCAGAGTGGTGCCAACCCTGTTTTCCCAGTC 
CACGTAGACAGATTCACAGTGCGGAATTCTGGAA 

CF139563.1 

CACGAGGTCTTCCCTTAATATTTCTGGTGGTTCTGACCAAAGCAGGTCATGGTTTGTTGA
GCATTTGGGATCCCAGTGAAGTAGATGTTTGTAGCCTTGCATACTTAGCCCTTCCCAGGC 
ACAAACGGAGTGGCAGAGTGGTGCCAACCCTGTTTTCCCAGTCCACGTAGACAGATTCAC 
AGTGCGGAATTCTGGAAGCTGGAGACAGACGGGCTCTTTGCAGAGCCGGGACTCTGAGAG 
GGACATGAGGGCCTCTGCCTCTGTGTTCATTCTCTGATGTCCTGTACCTGGGCTCAGTGC
CCGGTGGGACTCATCTCCTGGCCGCGCAGCAAAGCCAGCGGGTTCGTGCTGGTCCTTCCT
GCACCTTAGGCTGGGGGTGGGGGGCCTGCCGGCGCATTCTCCACGATTGAGCGCACAGGC 
CTGAAGTCTGGACAACCCGCAGAACCGAAGCTCCGAGCAGCGGGTCGGTGGCGAGTA 

CF139372

CACGAGGATTTCTGGTGGTTCTGACCAAAGCAGGTCATGGTTTGTTGAGCATTTGGGATC
CCAGTGAAGTAGATGTTTGTAGCCTTGCATACTTAGCCCTTCCCAGGCACAAACGGAGTG 
GCAGAGTGGTGCCAACCCTGTTTTCCCAGTCCACGTAGACAGATTCACAGTGCGGAATTC 
TGGAAGCTGGAGACAGACGGGCTCTTTGCAGAGCCGGGACTCTGAGAGGGACATGAGGGC 
CTCTGCCTCTGTGTTCATTCTCTGATGTCCTGTACCTGGGCTCAGTGCCCGGTGGGACTC
ATCTCCTGGCCGCGCAGCAAAGCCAGCGGGTTCGTGCTGGTCCTTCCTGCACCTT

CF139319 

CACGAGGAAGGCGATCCCTTTGCAGGGAAGCTTTCTCTCAGACCCCCTTCCATTACACCT 
CTCACCCTGGTAACAGCAGGAAGACTGAGGAGAGGGGAACGGGCAGATTCGTTGTGTGGC 
TGTGATGTCCGTTTAGCATTTTTCTCAGCTGACAGCTGGGTAGGTGGACAATTGTAGAGG
CTGTCTCTTCCTCCCTCCTTGTCCACCCCATAGGGTGTACCCACTGGTCTTGGAAACACC
CATCCTTAATACGATGATTTTTCTGTCGTGTGAAAATGAAGCCAGCAGGCTGCCCCTAGT 
CAGTCCTTCCTTCCAGAGAAAAAGAGATTTGAGAAAGTGCCTGGGTAATTCACCATTAAT 
TTCCTCCCCCAAACTCTCTGAGTCTTCCCTTAATATTTCTGGTGGTTCTGACCAAAGCAG 
GTCATGGTTTGTTGAGCATTTGGGATCCCAGTGAAGTAGATGTTTGTAGCCTTGCATACT
TAGCCCTTCC 

CF139275

CACGAGGTGGATTCCCCCGGCCTGGGTGGGGAGAGCGAGCTGGGTGCCCCCTAGATTCCC 
CGCCCCCGCACCTCATGAGCCGACCCTCGGCTCCATGGAGCCCGGCAATTATGCCACCTT
GGATGGAGCCAAGGATATCGAAGGCTTGCTGGGAGCGGGAGGGGGGCGGAATCTGGTCGC 
CCACTCCCCTCTGAGCAGCCACCCAGCGGCGCCTACGCTGATGCCTGCTGTCAACTATGC 
CCCCTTGGATCTGCCAGGCTCGGCGGAGCCGCCAAAGCAATGCCACCCATGCCCTGGGGT 
GCCCCAGGGGACGTCCCCAGCTCCCGTGCCTTATGGTTACTTTGGAGGCGGGTACTACTC 
CTGCCGAGTGTCGCGGAGCTCGCTGAAACCCTGTGCCCAGGCA

CF122893

CACGAGGATTTTCTATCTAGAGCTCTGTAGAGCACTTTAGAAACCGCTTTCATGAATTGA 
GCTAATTATGAATAAATTTGGAAGGCGATCCCTTTGCAGGGAAGCTTTCTCTCAGACCCC
CTTCCATTACACCTCTCACCCTGGTAACAGCAGGAAGACTGAGGAGAGGGGAACGGGCAG
ATTCGTTGTGTGGCTGTGATGTCCGTTTAGCATTTTTCTCAGCTGACAGCTGGGTAGGTG 
GACAATTGTAGAGGCTGTCTCTTCCTCCCTCCTTGTCCACCCCATAGGGTGTACCCACTG 
GTCTTGGAAACACCCATCCTTAATACGATGATTTTTCTGTCGTGTGAAAATGAAGCCAGC 
AGGCTGCCCCTAGTCAGTCCTTCCTTCCAGAGAAAAAGAGATTTGAGAAAGTGCCTGGGT
AATTCACCATTAATTTCCTCCCCCAAACTCTCTGAGTCTTCCCTTAATATTTCTGGTGGT
TCTGACCAAAGCAGGTCATGGTTTGTTGAGCATTTGGGATCCCAGTGAAGTANATGTTTG 
TAGCCTTGCATACTTAGCCCTT 

AI972423 

CATTTTCACACGACTGTAAAATCATCGTATTAAGGATGGGTGCTTCCAAGACCAGTGGGT
ACACCCTATGGGGTGGACAAGGAGGGAGGAAGAGACAGCCTCTACAATTGTCCACCTACC 
CAGCTGTCAGCTGAGAAAAATGCTAAACGGACATCACAGCCACACAACGAATCTGCCCGT 
TCCCCTCTCCTCAGTCTTCCTGCTGTTACCAGGGTGAGAGGTGTAATGGAAGGGGGTCTG 
AGAGAAAGCTTCCCTGCAAAGGGATCGCCTTCCAAATTTATTCATAATTAGCTCAATTCA
TGAAAGCGGTTTCTAAAGTGCTCTACAGAGCTCTAGATAGAAAATATGAGGCTAACGATC
ATGGCAGCTAGTACTGGTTATCGTGATTATTGCCACTGTCAGGATGAATGATTATGACTG 
GGCCAGGTTCTTTGGGAACCCTGGTGGAGTGGGCTGTCACATG 

AI918975 

TGCAGCTAGTACTGGTTATCGTGATTATTGCCACTGTCAGGATGAATGATTATGACTGGG
CCAGGTTCTTTGGGAACCCTGGTGGAGTGGGCTGTCACATGGGGTTCCGTCTCCCTGCAC
ATACTGGGTACCCAGGCCGCTCCTGAGGAACAGTCCAGCACAGGGTTTCAGCGAGCTCCG 
GGACACTCGGCCTCGTGC 

AI826991

TTTTTTTTTTTTTTTTTTTTCTTTTTCACTTTCTCAAATCTCTTTTTCTCTGGAAGGAAG 
GACTGACTAGGGGCAGCCTGCTGGCTTCATTTTCACACCACAAAAAAATCATCGTATTAA 
GGATGGGTGCTTCCAAAACCAGTGGGTACACCCTATGGGGTGGACAAGGAGGGAGGAAAA 
AACAGCCTCTACAATTGTCCACCTACCCAGCTGTCAGCTGAAAAAAATGCTAAACGGACA 
TCACAGCCACACAACGAATCTGCCCGTTCCCCTCTCCTCAGTCTTCCTGCTGTTACCAGG
GTGAAAGGTGTAATGGAAGG 

AI686312 

ACCGACCCCACTACTTGCCACCGACCCGCTGCTCGGAGCTTCGGTTCTGCGGGTTGTCCA 
GACTTCAGGCCTGTGCGCTCAATCGTGGAGAATGCGCCGGCAGGCCCCCCACCCCCAGCC 
TAAGGTGCAGGAAGGACCAGCACGAACCCGCTGGCTTTGCTGCGCGGCCAGGAGATGAGT 
CCCACCGGGCACTGAGCCCAGGTACAGGACATCAGAGAATGAACACAGAGGCAGAGGCCC 
TCATGTCCCTCTCAGAGTCCCGGCTCTGCAAAGAGCCCGTCTGTCTCCAGCTTCCAGAAT 
TCCGCACTGTGAATCTGTCTACGTGGACTGGGAAAACAGGGTTGGCACCACTCTGCCACT 
CCGTTTGTGCCTGGGAAGGGCTAAGTATGCAAGGCTACAAACATCTACTTCACTGGGATC 
C 

AI655923 

TTTTTTTTTTTTTTTCCCTGCAAAGGGATCGCCTTCCAAATTTATTCATAATTAGCTCAA 
TTCATGAAAGCGGTTTCTAAAGTGCTCTACAGAGCTCTAGATAGAAAATATGAGGCTAAC
GATCATGGCAGCTAGTACTGGTTATCGTGATTATTGCCACTGTCAGGATGAATGATTATG 
ACTGGGCCAGGTTCTTTGGGAACCCTGGTGGAGTGGGCTGTCACATGGGGTTCCGTCTCC 
CTGCACATACTGGGTACCCAGGCCGCTCCTGA 

CF146922

CACGAGGCGACTTGCGAGCTGGGAGCGATTTAAAACGCTTTGGATTCCCCGGCCTGGGTG 
GGGAGAGCGAGCTGGGTGCCCCCTAGATTCCCCGCCCCCGCACCTCATGAGCCGACCCTC 
GGCTCCATGGAGCCCGGCAATTATGCCACCTTGGATGGAGCCAAGGATATCGAAGGCTTG 
CTGGGAGCGGGAGGGGGGCGGAATCTGGTCGCCCACTCCCCTCTGACCAGCCACCCAGCG
GCGCCTACGCTGATGCCTGCTGTCAACTATGCCCCCTTGGATCTGCCAGGCTCGGCGGAG
CCGCCAAAGCAATGCCACCCATGCCCTGGGGTGCCCCAGGGGACGTCCCCAGCTCCCGTG 
CCTTATGGTTACTTTGGAGGCGGGTACTACTCCTGCCGAGTGTCCCGGAGCTCGCTGAAA 
CCCTGTGCCCAGGCAGCCACCCTGGCCGCGTACCCCGCGGAGACTCCCACGGCCGGGGAA 
GAGTACCCCAGCCGCCCCACTGAGTTTGCCTTCTATCCGGGATATCCGGGAACCTACCAG
CCTATGGCCAGTTACCTGGACGTGTCTGTGGTGCAGACTCTGGGTGCTCCTGGAGAACGC
GACATGACTCCCTGTTGCCTGTGGACAGTTACCAGTCTTGGGCTCTCGCTGGTGGCTGGA
ACAGCCAGATGTGTTGCCA 

BF476369 

GCGGCCGCGGCCCACCACCAACTGCTCGCCATTCGACCCCACTACTCGCCACCGACCCGC
TGCTCGGAGCTTCGGTTCTGCGGGTTGTCCAGACTTCAGGCCTGTGCGCTCAATCGTGGA 
GAATGCGCCGGCAGGCCCCCCACCCCCAGCCTAAGGTGCAGGAAGGACCAGCACGAACCC 
GCTGGCTTTGCTGCGCGGCCAGGAGATGAGTCCCACCGGGCACTGAGCCCAGGTACAGGA 
CATCAGAGAATGAACACAGAGGCAGAGGCCCTCATGTCCCTCTCAGAGTCCCGGCTCTGC 
AAAGAGCCCGTCTGTCTCCAGCTTCCAGAATTCCGCACTGTGAATCTGTCTACGTGGACT
GGGAAAACAGGGTTGGCACCACTCTGCCACTCC 

BF057410 

GCGGCCGCGGCCCACCACCAACTGCTCGCCACCGACCCCACTACTCGCCACCGACCCGCT 
GCTCGGAGCTTCGGTTCTGCGGGTTGTCCAGACTTCAGGCCTGTGCGCTCAATCGTGGAG
AATGCGCCGGCAGGCCCCCCACCCCCAGCCTAAGGTGCAGGAAGGACCAGCACGAACCCG 
CTGGCTTTGCTGCGCGGCCAGGAGATGAGTCCCACCGGGCACTGAGCCCAGGTACAGGAC 
ATCAGAGAATGAACACAGAGGCAGAGGCCCTCATGTCCCTCTCAGAGTCCCGGCTCTGCA 
AAGAGCCCGTCTGTCTCCAGCTTCCAGAATTCCGCACTGTGAATCTGTCTACGTGGACTG 
GGAAAACAGGGTTGGCACCACTCTGCCACTCCGTTTGTGCCTGGGAAGGGCTAAGTATGC
AAGGCTACAAACATCTACTTCACTGGGATCCCAAATGCTCAACAAACCATGACCTGCTNT 
GGTCAGAACCACCAGAAATATTAA 

BE645544 

GCGGCCGCGGCCCACCACCAACTGCTCGCCACCGACCCCACTACTCGCCACCGACCCGCT
GCTCGGAGCTTCGGTTCTGCGGGTTGTCCAGACTTCAGGCCTGTGCGCTCAATCGTGGAG 
AATGCGCCGGCAGGCCCCCCACCCCCAGCCTAAGGTGCAGGAAGGACCAGCACGAACCCG 
CTGGCTTTGCTGCGCGGCCAGGAGATGAGTCCCACCGGGCACTGAGCCCAGGTACAGGAC 
ATCAGAGAATGAACACAGAGGCAGAGGCCCTCATGTCCCTCTCAGAGTCCCGGCTCTGCA
AAGAGCCCGTCTGTCTCCAGCTTCCAGAATTCCGCACTGTGAATCTGTCTACGTGGACTG
GGAAAACAGGGTTGGCACCACTCTGCCACTCCGTTTGTGCCTGGGAAGGGCTAAGTATGC 
AAGGCTACAAACATCTACTTCACTGGGATCC 

BE645408 

TCCTCCCTCTAAGAAAGGCGCAAGCGTCAAGAGGGTGCTGCCCGCTGGTTTCTGCAAATG
CTGCCTTCCAAAAAGGACCTGGTGGGTTCTGTTCTCCCTGGCAACACATCTGGCTGTTCC 
AGCCACCAGCGAGAGCCCAAGACTGGTAACTGTCCACAGGCAACAGGGAGTCATGTCGCG 
GTTCTCCAGGAGCACCCAGAGTCTGCACCACAGACACGT 

BE388501

TTAATACGATGATTTTTCTGTCGTGTGAAAATGAAGCCAGCAGGCTGCCCCTAGTCAGTC
CTTCCTTCCAGAGAAAAAGAGATTTGAGAAAGTGCCTGGGTAATTCACCATTAATTTCCT 
CCCCCAAACTCTCTGAGTCTTCCCTTAATATTTCTGGTGGTTCTGACCAAAGCAGGTCAT 
GGTTTGTTGAGCATTTGGGATCCCAGTGAAGTAGATGTTTGTAGCCTTGCATACTTAGCC 
CTTCCCAGGCACAAACGGAGTGGCAGAGTGGTGCCAACCCTGTTTTCCCAGTCCACGTAG
ACAGATTCACAGTGCGGAATTCTGGAAGCTGGAGACAGACGGGCTCTTTGCAGAGCCGGG 
ACTCTGAGAGGGACATGAGGGCCTCTGCCTCTGTGTTCATTCTCTGATGTCCTGTACCTG 
GGCTCAGTGCCCGGTGGGACTCATCTCCTGGCCGCGCAGCAAAGCCAGCGGGTTCGTGCT 
GGTCCTTCCTGCACCTTAGGCTGGGGGTGGGGGGCCTGCCGGCGCATTCTCCACGATTGA 
GCGCACAGGCCTGAAGTCTGGACAACCCGCAGAACCGAAGCTCCGAGCAGCGGGTCGGTG
GCGAGTAGTGGGGGTCGGTGGCGAACAAGTGGTGGTGGGCCGGGGCCGCATAACTCGAGG 
ACTTTCCTCCCGGAGCAGTCCCTAAAAACCCGGGGGCGC 

CF147366 

GACGAGGACAATTGTAGAGGCTGTCTCTTCCTCCCTCCTTGTCACCCCATAGGGTGTACC
ACTGGTCTTGGAAGCACCCATCCTTAATACGATGATTTTTCTGTCGTGTGAAAATGAAGC 
CAGCAGGCTGCCCCTAGTCAGTCCTTCCTTCCAGAGAAAAAGAGATTTGAGAAAGTGCCT 
GGGTAATTCACCATTAATTTCCTCCCCCAAACTCTCTGAGTCTTCCCTTAATATTTCTGG 
TGGTTCTGACCAAAGCAGGTCATGGTTTGTTGAGCATTTGGGATCCCAGTGAAGTAGATG
TTTGTAGCCTTGCATACTTAGCCCTTCCCAGGCACAAACGGAGTGGCAGAGTGGTGCCAA
CCCTGTTTTCCCAGTCCACGTAGACAGATTCACAGTGCGGAATTCTGGAAGCTGGAGACA 
GACGGGCTCTTTGCAGAGCCGGGACTCTGAGAGGGACATGAGGGCCTCTGCCTCTGTGTT 
CATTCTCTGATGTCCTGTACCTGGGCTCAGTGCCCGGTGGGACTCATCTCCTGGCCGCGC 
AGCAAAGCCAGCGGGTTCGTGCTGGTCCTTCCTGC 

CF147143 

CACGAGGCGACTTGCGAGCTGGGAGCGATTTAAAACGCTTTGGATTCCCCCGGCCTGGGT 
GGGGAGAGCGAGCTGGGTGCCCCCTAGATTCCCCGCCCCCGCACCTCATGAGCCGACCCT 
CGGCTCCATGGAGCCCGGCAATTATGCCACCTTGGATGGAGCCAAGGATATCGAAGGCTT
GCTGGGAGCGGGAGGGGGGCGGAATCTGGTCGCCCACTCCCCTCTGACCAGCCACCCAGC
GGCGCCTACGCTGATGCCTGCTGTCAACTATGCCCCCTTGGATCTGCCAGGCTCGGCGGA 
GCCGCCAAAGCAATGCCACCCATGCCCTGGGGTGCCCCAGGGGACGTCCCCAGCTCCCGT 
GCCTTATGGTTACTTTGGAGGCGGGTACTACTCCTGCCGAGTGTCCCGGAGCTCGCTGAA 
ACCCTGTGCCCAGGCAGCCACCCTGGCCGCGTACCCCGCGGAGACTCCCACGGCCGGGGA
AGAGTACCCAGCCGCCCCACTGAGTTTGCCTTCTATCCGGGATATCCGGGAACCTACCAG
CCTATGGCCAGTTACCTGGACGTGTCTGTGGTGCAGACTCTGGGTGCTCCTGGAGAACGC 
GACATGACTCCCTGTTGCCTGTGGACAGTTACCAATCTTGGGCTCTCGCTGGTGGCTGGA 
ACAGCCAGATGTGTTGCCAGGGAG 

BT007410

atggagcccg gcaattatgc caccttggat ggagccaagg atatcgaagg cttgctggga 
gcgggagggg ggcggaatct ggtcgcccac tcccctctga ccagccaccc agcggcgcct
acgctgatgc ctgctgtcaa ctatgccccc ttggatctgc caggctcggc ggagccgcca
aagcaatgcc acccatgccc tggggtgccc caggggacgt ccccagctcc cgtgccttat 
ggttactttg gaggcgggta ctactcctgc cgagtgtccc ggagctcgct gaaaccctgt 
gcccaggcag ccaccctggc cgcgtacccc gcggagactc ccacggccgg ggaagagtac
cccagccgcc ccactgagtt tgccttctat ccgggatatc cgggaaccta ccagcctatg 
gccagttacc tggacgtgtc tgtggtgcag actctgggtg ctcctggaga accgcgacat 
gactccctgt tgcctgtgga cagttaccag tcttgggctc tcgctggtgg ctggaacagc 
cagatgtgtt gccagggaga acagaaccca ccaggtccct tttggaaggc agcatttgca 
gactccagcg ggcagcaccc tcctgacgcc tgcgcctttc gtcgcggccg caagaaacgc
attccgtaca gcaaggggca gttgcgggag ctggagcggg agtatgcggc taacaagttc 
atcaccaagg acaagaggcg caagatctcg gcagccacca gcctctcgga gcgccagatt 
accatctggt ttcagaaccg ccgggtcaaa gagaagaagg ttctcgccaa ggtgaagaac 
agcgctaccc cttag 

BC007092 

ggattccccc ggcctgggtg gggagagcga gctgggtgcc ccctagattc cccgcccccg 
cacctcatga gccgaccctc ggctccatgg agcccggcaa ttatgccacc ttggatggag 
ccaaggatat cgaaggcttg ctgggagcgg gaggggggcg gaatctggtc gcccactccc
ctctgaccag ccacccagcg gcgcctacgc tgatgcctgc tgtcaactat gcccccttgg
atctgccagg ctcggcggag ccgccaaagc aatgccaccc atgccctggg gtgccccagg 
ggacgtcccc agctcccgtg ccttatggtt actttggagg cgggtactac tcctgccgag 
tgtcccggag ctcgctgaaa ccctgtgccc aggcagccac cctggccgcg taccccgcgg 
agactcccac ggccggggaa gagtacccca gccgccccac tgagtttgcc ttctatccgg
gatatccggg aacctaccag cctatggcca gttacctgga cgtgtctgtg gtgcagactc
tgggtgctcc tggagaaccg cgacatgact ccctgttgcc tgtggacagt taccagtctt 
gggctctcgc tggtggctgg aacagccaga tgtgttgcca gggagaacag aacccaccag 
gtcccttttg gaaggcagca tttgcagact ccagcgggca gcaccctcct gacgcctgcg 
cctttcgtcg cggccgcaag aaacgcattc cgtacagcaa ggggcagttg cgggagctgg
agcgggagta tgcggctaac aagttcatca ccaaggacaa gaggcgcaag atctcggcag
ccaccagcct ctcggagcgc cagattacca tctggtttca gaaccgccgg gtcaaagaga 
agaaggttct cgccaaggtg aagaacagcg ctacccctta agagatctcc ttgcctgggt 
gggaggagcg aaagtggggg tgtcctgggg agaccaggaa cctgccaagc ccaggctggg 
gccaaggact ctgctgagag gcccctagag acaacaccct tcccaggcca ctggctgctg
gactgttcct caggagcggc ctgggtaccc agtatgtgca gggagacgga accccatgtg
acagcccact ccaccagggt tcccaaagaa cctggcccag tcataatcat tcatcctgac 
agtggcaata atcacgataa ccagtactag ctgccatgat cgttagcctc atattttcta 
tctagagctc tgtagagcac tttagaaacc gctttcatga attgagctaa ttatgaataa 
atttggaaaa aaaaaaaaaa aaaaaaaaaa aaaaaa 

U57052 

cgggtgcccc ctagattccc cgcccccgca cctcatgagc cgaccctcgg ctccatggag 
cccggcaatt atgccacctt ggatggagcc aaggatatcg aaggcttgct gggagcggga
ggggggcgga atctggtcgc ccactcccct ctgaccagcc acccagcggc gcctacgctg
atgcctgctg tcaactatgc ccccttggat ctgccaggct cggcggagcc gccaaagcaa
tgccacccat gccctggggt gccccagggg acgtccccag ctcccgtgcc ttatggttac
tttggaggcg ggtactactc ctgccgagtg tcccggagct cgctgaaacc ctgtgcccag
gcagccaccc tggccgcgta ccccgcggag actcccacgg ccggggaaga gtaccccagc
cgccccactg agtttgcctt ctatccggga tatccgggaa cctaccacgc tatggccagt
tacctggacg tgtctgtggt gcagactctg ggtgctcctg gagaaccgcg acatgactcc
ctgttgcctg tggacagtta ccagtcttgg gctctcgctg gtggctggaa cagccagatg
tgttgccagg gagaacagaa cccaccaggt cccttttgga aggcagcatt tgcagactcc
agcgggcagc accctcctga cgcctccgcc tttcgtcgcg gccgcaagaa acgcattccg
tacagcaagg ggcagttgcg ggagctggag cgggagtatg cggctaacaa gttcatcacc
aaggacaaga ggcgcaagat ctcggcagcc accagcctct cggagcgcca gattaccatc
tggtttcaga accgccgggt caaagagaag aaggttctcg ccaaggtgaa gaacagcgct
accccttaag agatctcctt gcctgggtgg gaggagcgaa agtgggggtg tcctggggag
accaggaacc tgccaagccc aggctggggc caaggactct gctgagaggc ccctagagac
aacacc



U81599 

tcctaatacg actcactata gggctcgagc ggccgcccgg gcaggtcgaa tgcaggcgac
ttgcgagctg ggagcgattt aaaacgcttt ggattccccc ggcctgggtg gggagagcga
gctgggtgcc ccctagattc cccgcccccg cacctcatga gccgaccctc ggctccatgg
agcccggcaa ttatgccacc ttggatggag ccaaggatat cgaaggcttg ctgggagcgg
gaggggggcg gaatctggtc gcccactccc ctctgaccag ccacccagcg gcgcctacgc
tgatgcctgc tgtcaactat gcccccttgg atctgccagg ctcggcggag ccgccaaagc
aatgccaccc atgccctggg gtgccccagg ggacgtcccc agctcccgtg ccttatggtt
actttggagg cgggtactac tcctgccgag tgtcccggag ctcgctgaaa ccctgtgccc
aggcagccac cctggccgcg taccccgcgg agactcccac ggccggggaa gagtacccca
gtcgccccac tgagtttgcc ttctatccgg gatatccggg aacctaccac gctatggcca
gttacctgga cgtgtctgtg gtgcagactc tgggtgctcc tggagaaccg cgacatgact
ccctgttgcc tgtggacagt taccagtctt gggctctcgc tggtggctgg aacagccaga
tgtgttgcca gggagaacag aacccaccag gtcccttttg gaaggcagca tttgcagact
ccagcgggca gcaccctcct gacgcctgcg cctttcgtcg cggccgcaag aaacgcattc
cgtacagcaa ggggcagttg cgggagctgg agcgggagta tgcggctaac aagttcatca
ccaaggacaa gaggcgcaag atctcggcag ccaccagcct ctcggagcgc cagattacca
tctggtttca gaaccgccgg gtcaaagaga agaaggttct cgccaaggtg aagaacagcg
ctacccctta agagatctcc ttgcctgggt gggaggagcg aaagtggggg tgtcctgggg
agaccagaaa cctgccaagc ccaggctggg gccaaggact ctgctgagag gcccctagag
acaacaccct tcccaggcca ctggctgctg gactgttcct caggagcggc ctgggtaccc
agtatgtgca gggagacgga accccatgtg acaggcccac tccaccaggg ttcccaaaga
acctggccca gtcataatca ttcatcctca cagtggcaat aatcacgata accagt

ATTTTTCTGTCGTGTGAAAATGAAGCCAGCAGGCTGCCCCTAGTCAGTCCTTCCTTCCAG
AGAAAAAGAGATTTGAGAAAGTGCCTGGGTAATTCACCATTAATTTCCTCCCCCAAACTC 
TCTGAGTCTTCCCTTAATATTTCTGGTGGTTCTGACCAAAGCAGGTCATGGTTTGTTGAG 
CATTTGGGATCCCAGTGAAGTAGATGTTTGTAGCCTTGCATACTTAGCCCTTCCCAGGCA 
CAAACGGAGTGGCAGAGTGGTGCCAACCCTGTTTTCCCAGTCCACGTAGACAGATTCACA
GTGCGGAATTCTGGAAGCTGGAGACAGACGGGCTCTTTGCAGAGCCGGGACTCTGAGAGG 
GACATGAGGGCCTCTGCCTCTGTGTTCATTCTCTGATGTCCTGTACCTGGGCTCAGTGCC 
CGGTGGGACTCATCTCCTGGCTGCGCAGCAAAGCCAGCGGGTTCGTGCTGGTCCTTCCTG 
CACCTTAGGCTGGGGGTGGGGGGCCT 

CB125764

ATTTTTCTGTCGTGTGAAAATGAAGCCAGCAGGCTGCCCCTAGTCAGTCCTTCCTTCCAG 
AGAAAAAGAGATTTGAGAAAGTGCCTGGGTAATTCACCATTAATTTCCTCCCCCAAACTC 
TCTGAGTCTTCCCTTAATATTTCTGGTGGTTCTGACCAAAGCAGGTCATGGTTTGTTGAG 
CATTTGGGATCCCAGTGAAGTAGATGTTTGTAGCCTTGCATACTTAGCCCTTCCCAGGCA 
CAAACGGAGTGGCAGAGTGGTGCCAACCCTGTTTTCCCAGTCCACGTAGACAGATTCACA 
GTGCGGAATTCTGGAAGCTGGAGACAGACGGGCTCTTTGCAGAGCCGGGACTCTGAGAGG 
GACATGAGGGCCTCTGCCTCTGTGTTCATTCTCTGATGTCCTGTACCTGGGCTCAGTGCC 
CGGTGGGACTCATCTCCTGGCTGCGCAGCAAAGCCAGCGGGTTCGTGCTGGTCCTTCCTG 
CACCTTAGGCTGGGGGTGGGGGGGGCCTGCCGGCGCATTCTCCACGATTGAGCGCACAGG 
CCTGAAGTCTGGACAACCCGCAGAACCGAAGCTCCGAGCAGCGGGTCGGTGGCGAGT



AU098628 

ATTTAAAACGCTTTGGATTCTTTCGTCCTGCGTGGGGAGAGCGAGCTGGGTGCCCCCTAG 
ATTCCCCGCCCCCGCACCTCATGAGCCGACCCTCGGCTCCATGGAGCCCGGCACTTATGC
CACCTTGGATGGAGCCAAGGATATCGAAGGCTTGCTGGGAGCGGGAGGGGGGCGGAATCT 
GGTCGCCCACTCCCCTCTGACCAGCCACCCAGCGGCGCCTACGCTGATGCCTGCTGTCAA 
TTATGCCCCCTTGCATCTGCCAGGCTCGGCGGAGCCGCCAAAGCAATGCCACCCATGCCC 



CB126130

ATTTTTCTGTCGTGTGAAAATGAAGCCAGCAGGCTGCCCCTAGTCAGTCCTTCCTTCCAG 
AGAAAAAGAGATTTGAGAAAGTGCCTGGGTAATTCACCATTAATTTCCTCCCCCAAACTC 
TCTGAGTCTTCCCTTAATATTTCTGGTGGTTCTGACCAAAGCAGGTCATGGTTTGTTGAG 
CATTTGGGATCCCAGTGAAGTAGATGTTTGTAGCCTTGCATACTTAGCCCTTCCCAGGCA 
CAAACGGAGTGGCAGAGTGGTGCCAACCCTGTTTTCCCAGTCCACGTAGACAGATTCACA
GTGCGGAATTCTGGAAGCTGGAGACAGACGGGCTCTTTGCAGAGCCGGGACTCTGAGAGG 
GACATGAGGGCCTCTGCCTCTGTGTTCATTCTCTGATGTCCTGTACCTGGGCTCAGTGCC 
CGGTGGGACTCATCTCCTGGCTGCGCAGCAAAGCCAGCGGGTTCGTGCTGGTCCTTCCTG 
CACCTTAGGCTGGGGGTGGGGGGCCTGC 

AGGCCGCACCCAGTCTTAAGGTGCAGTGAAGGACAGCACGAACCCGCTGTGCTTTGCTGC
GCGGCAGGAGATGAGTCCCACCGGGCACTGAGCCCAGGTACAGGACATCAGAGAATGAAC 
ACAGAGGCAGAGGCCCTCATGTCCCTCTCAGAGTCCCGGCTCTGCAAAGAGCCCGTCTGT 
CTCCAGCTTCCAGAATTCCGCACTGTGAATCTGTCTACGTGGACTGNGAAAACAGGGTTG 
GCACCACTCTGCCACTCCGTTTGTGCCTNGGGGCGGGCAGAGGG

BM767063.1 

AAAAACGCTTTGGATTCCCCCGGCCTGGGTGGGGAGAGCGAGCTGGGTGCCCCCTAGATT 
CCCCGCCCCCGCACCTCATGAGCCGACCCTCGGCTCCATGGAGCCCGGCAATTATGCCAC
CTTGGATGGAGCCAAGGATATCGAAGGCTTGCTGGGAGCGGGAGGGGGGCGGAATCTGGT
CGCCCACTCCCCTCTGACCAGCCACCCAGCGGCGCCTACGCTGATGCCTGCTGTCAACTA 
TGCCCCCTTGGATCTGCCAGGCTCGGCGGAGCCGCCAAAGCAATGCCACCCATGCCCTGG 
GGTGCCCCAGGGGACGTCCCCAGCTCCCGTGCCTTATGGTTACTTTGGAGGCGGGTACTA 
CTCCTGCCGAGTGTCCCGGAGCTCGCTGAAACCCTGTGCCCAGGCAGCCACCCTGGCCGC
GTACCCCGCGGAGACTCCCACGGCCGGGGAAGAGTACCCCAGCCGCCCCACTGAGTTTGC
CTTCTATCCGGGATATCCGGGAACCTACCAGCCTATGGCCAGTTACCTGGACGTGTCTGT 
GGTGCAGACTCTGGGTGCTCCTGGAGAACCGCGACATGACTCCCTGTTGCCTGTGGACAG 
TTACCAGTCTTGGGCTCTCGCTGGTGGCTGGAACAGCCAGATGTGTTGCCA 

BM794275

GCAGACTCTGGGTGCTCCTGGAGAACCGCGACGTGACTCCCTGTTGCCTGTGGACAGTTA 
CCACTCTTGGGCTCTCGCTGGTGGCTGGAACAGCCAGATGTGTTGCCAGGGAGAACAGAA 
CCCACCAGGTCCCTTTTGGAAGGCAGCATTTGCAGACTCCAGCGGGCAGCACCCTCCTGA 
CGCCTGCGCCTTTCGTCGCGGCCGCAAGAAACGCATTCCGTACAGCAAGGGGCAGTTGCG 
GGAGCTGGAGCGGGAGTATGCGGCTAACAAGTTCATCACCAAGGACAAGAGGCGCAAGAT
CTCGGCAGCCACCAGCCTCTCGGAGCGCCAGATTACCATCTGGTTTCAGAACCGCCGGGT 
CAAAGAGAAGAAGGTTCTCGCCAAGGTGAAGAACAGCGCTACCCCTTAAGAGATCTCCTT 
GCCTGGGTGGGAGGATCTAAAGTGGGGGTGTCCTGGGGAGACCAGGAACCTGCCAAGCCC
AGGCTGGGGCCAAGGACT 

BQ363211 

ACGCTGCACTGCGTTTCAAAGAGAAGAAGGTTCTCGCCAAGGTGAAGAACAGCGCTACCC 
CTTAAGAGATCTCCTTGCTTGGGTGGGAGGAGCGAAAGTGGGGGTGTCCTGGGGAGACCA 
GGAACCTGCCATCACCAGGCTGGGCCCAAGGACTCTGCTGAGAGGCCCCTAGAGACAACA 
CCCTTCCCAGGCCATTGCTTGCTGGACTGTGCCTCAGGAGCGGCCTGGGTACC

BM932052 

GAGTTTTCCAATTTCCAAAGAAAAATTTAGGTTTCCTGCAGCCGTGACATATGTGTGTGC 
ACTGGGATGGGTTAATGTGTGTGTGTGTGTGTGTATGCGCATGTATTGGGAGTGGGGGCA 
GAAACGTGTTTCCAGAATTTGCCTGTAGAATCTAAAAGAGTGGCCAAGAGTCTGGAAATG
CATGAAGACTGGACGTATGTGATGGTGGGCAAAGGCCTGACTGTGTGTGGTGTGTGGGTA
TGTTTGCAGATTCGCGGGTGTGAGAGCAGTGATGGGTGAGGGTGGCCTTCAGGAGCCAAG 
GCTGATCGGTGGTGAGAGAACAAGCCGGAAGCCAGGGTGCTGTCCTGGTATGCTTTGGAG 
GAACAGGATTGCACGTGCGCCTGTAGGGTGACCTGTGTGCACCTGTGAGATGACTTAGCT 
TGGGGCTTGCAAGGCCTGGGTCTGCATGGGTGGGTATCTGACCATGCCTTTTCCTCCCTC
CCTTTCACGCCGCGCAGACTCCAGCGGGCAGCACCCTCCTGACGCCTGCGCCTTTCGTC 

AA357646.1 

CCGGCCTGGGTGGGGAGAGCGAGCTGGGTGCCCCCTAGATTCCCCGCCCCCGCACCTCAT 
GAGCCGACCCTCGGCTCCATGGAGCCCGGCAATTATGCCACCTTGGATGGAGCCAAGGAT
ATCGAAGGCTTGCTGGGAGCGGGAGGGGGGCGGAATCTGGTCGCCCACTCCCCTCTGACC 
AGCCACCCAGCGGCGCCTACGCTTGATGCCTGCTTGTCAACTATGCCCCCTTGGATCTGC 

AW609525 

ACCGCGGGTCAAATTTATTCATAATTAGCTCAATCATGAAAGCGGTTCTAAAGTGCTCTA
CAGAGCTCTAGATAGAAAATATGAGGCTAACGATCATGGCAGCTAGTACTGGTTATCGTG 
ATTATGGCCACTGTCAGGATGAATGATAATGACTGGGCCAGGTCCTTTGGAAACCCTGGT 
GGAGTGGGCTGTCACATGGGGTCCCGTCTCCCTGCACATACTGGGTACCCAGGCCGCTCC 
TGAGGAACAGTCCAGCAGCCAGTGGCCTGGGAAGGGTGTGGTCTCTAGGGGCCTCTCAGC
AGAGTCCTTGGCCCCAGCCTGGGCTTGGCAGGTCCCTGGTCTCCCCAGGACACCCCCACT
TTCGCTCCTCCCACCCAGGCAAGGAGATCTCTTAAGGGGTAGCGCTGTTCTTCACCTTGG 
CGAGAACCTTCTTCTCTTTGAACCGGCGGTGCGGCGTGGGGTACCGAGC 

CB126919 

ATTTTTCTGTCGTGTGAAAATGAAGCCAGCAGGCTGCCCCTAGTCAGTCCTTCCTTCCAG
AGAAAAAGAGATTTGAGAAAGTGCCTGGGTAATTCACCATTAATTTCCTCCCCCAAACTC 
TCTGAGTCTTCCCTTAATATTTCTGGTGGTTCTGACCAAAGCAAGTCATGGTTTGTTGAG 
CATTTGGGATCCCAGTGAAGTAGATGTTTGTAGCCTTGCATACTTAGCCCTTCCCAGGCA 
CAAACGGAGTGGCAGAGTGGTGCCAACCCTGTTTTCCCAGTCCACGTAGACAGATTCACA 

GTGCGGAATTCTGGAAGCTGGAGACAGACGGGCTCTTTGCAGAGCCGGGACTCTGAGAGG
GACATGAAGGCCTCTGCCTCTGTGTTCATTCTCTGATGTCCTGTACCTGGGCTCAGTGCC 
CGGTGGGACTCATCTCCTGGCTGCGCAGCAAAGCCAGCGGGTTCGTGCTGGT 

AW609336 

CCAACGAGAAGAAGGTTCTCGCAAGGTGAAGAACAGCGCTACCCCTTAAGAGATCTCCTT
GCGTGGGTGGGAGGAGCGAAAGTGGGGGTGTCCTGGGGAGACCAGGAACCTGCCAGCCCA 
GGCTGAGGCCAAGGACTCTGCTGAGAGGCCCCTAGAGACAACACCCTTCCCAGGCCACTG 
GATGCTGAACTGTCCCTCAGGAGCGGCCTGGGTACCCAGTATGTGCAGGGAGACGGAACC
CCATGTGACAGCCCACTCCACCAGGGTTCCCAAAGAACCTGGCCCCAGTCATAATCATTC 

ATCCTGACAGTGGCAATAATCACGATAACCAGTACTAGCTGCCATGATCGTAAGCCTCAT
ATTTGCTATCTAGAGCTCTGTAGAGCACTTTAGAAACCGCTTTCATGAATTGAGCTAATT
ATGACTCAATTTGAACCGGCGTCCGGCGTG 

AW609244 

ACGCGCACCGCGGTCAAGAGAAGAAGGTTCTCGCAAGGTGAAGAACAGCGCTACCCCTTA
AGAGATCTCCTTGCGTGGGTGGGAGGAGCGAAAGTGGGGGTGTCCTGGGGAGACCAGGAA 
CCTGCCAAGCCCAGGCTGTGGCCAAGGACTCTGCTGAGAGGCCCCTATGAGACAACACCC
TTCCCAGGCCACTGGCTGCTGGGACTGTTCCTCAGGAGCGGCCTGGGTACCCGAGTAATG 
TGCAGGGGAGACGGAACCCCATGTGACAGCCCACTCCACCAGGGTTCCCAAAAGAACCCT 
GGCCCAGTCATAATCATTCATCCTGACAGTGGCAATAATCACGATAACCAGTACTAGCTG
CCATGATCGTAAGCCTCATATTTGCTATCTAGAGCTCTGTAGAGCCCTTTAGAAACCGCT 
TTCATGAATGGAGCTAAATTATGAATACATTTGAACCGGCGATCCGACGTGA 

BF855145 

CTAGAGGATCCCGGAAGCAACTGCAACAGGTTCCCAAAGAACCGGGCCAGTCATAATCAT
TCATCCTGACAGGGCAATAATCACGATAACCAGTACTAGCTGCCATGATCGTTAGCCTCA 
TATTTTCTATCTAGAGCTCTGTAGAGCACTTTAGAAACCGCTTTCATGAATGGAGCTAAT 
TATGAATAAATTTGGAAGGCGATCCCTTGGCAGGGAAGCTTTCTCTCAGACCCCCTTCCA 
TTACACCTCTCACCCTGGTAACAGCAGGAAGACTGAGGAGAGGGGAACGGGCAGATTCGT 

GGTGTTGCAGTGTGCTTCCG

AU126914

GAGCGAATGCAGGCGACTTGCGAGCTGGGAGCGATTTAAAACGCTTTGGATTCCCCCGGC 
CTGGGTGGGGAGAGCGAGCTGGGTGCCCCCTAGATTCCCCGCCCCCGCACCTCATGAGCC 

GACCCTCGGCTCCATGGAGCCCGGCAATTATGCCACCTTGGATGGAGCCAAGGATATCGA
AGACTTGCTGGGAGCGGGAGGGGGGCGGAATCTGGTCGCCCACTCCCCTCTGACCAGCCA 
CCCAGCGGCGCCTACGCTGATGCCTGCTGTCAACTATGCCCCCTTGGATCTGCCAGGCTC 
GGCGGAGCCGCCAAAGCAATGCCACCCATGCCCTGGGGTGCCCCAGGGGACGTCCCCAGC 
TCCCGTGCCTTATGGTTACTTTGGAGGCGGGTNCTACTCCTGCCGAGTGTCCCGGAGCTC 

GCTGAAACCCTGTGCCCANNCANCCACCCTGGCCGCGTN

CB126449

CTCTGCCTCTGTGTTCATTCTCTGATGTCCTGTACCTGTGCTCAGTGCCCGGTGGGACTC 
ATCTCCTGGCTGCGCAGCAAAGCCAGCGGGTTCGTGCTGGTCCTTCCTGCACCTTCGGCT 
GGGGGTGGGGGGCCTGCCGGCGCATTCTCCACGATT

AW582404 

ACGCTGCACCGCCGGTCCAAGAGAAGAAGGTTCTCGCCAAGGTGAAGAACAGCGCTACCC 
CTTTAAGAGATCTCCTTGCTGGGGTGGGAGGAGCGAAAGTGGGGGTGTCTGGGGAGACCA 
GGAACCTGCCAGCCCCAGGCTGGGCCCAAGGACTCTGCTGAGAGGCCCCTAGAGACAACA
CCCTTCCCAGGCCACTGTCTGCTGGACTGTTCCTCAGGAGCGGCCTGGGTACNCAGTATG 
TGCAGGGAGACGGAACCCCATGTGACAGCCCACTCCACCAGGGTTCCCAAAGAACCTGGC 
CCAGTCATAATCATTCATCCTGACAGTGGCAATAATCACGATAACCAGTACTAGCTGCCA 
TGATCGTTAGCCTCATATTTTCTATCTAGAGCTCTGTAGAGCACTTTAGAAACCGCTTTC
ATGAATTGAGCTACTTATGAATCACTTTGAACCGGCGGTGCGGCGTG 

BX641644 

GGGGGAGAGCGAGCTGGGTGCCCCCTAGATTCCCCGCCCCCGCACCTCATGAGCCGACCC 
TCGGCTCCATGGAGCCCGGCAATTATGCCACCTTGGATGGAGCCAAGGATATCGAAGGCT
TGCTGGGAGCGGGAGGGGGGCGGAATCTGGTCGCCCACTCCCCTCTGACCAGCCACCCAG 
CGGCGCCTACGCTGACGCCTGCTGTCAACTATGCCCCCTTGGATCTGCCAGGCTCGGCGG 
AGCCGCCAAAGCAATGCCACCCATGCCCTGGGGTGCCCCAGGGGACGTCCCCAGCTCCCG 
TGCCTTATGGTTACTTTGGAGGCGGGTACTACTCCTGCCGAGTGTCCCGGAGCTCGCTGA 
AACCCTGTGCCCAGGCAGCCACCCTGGCCGCGTACCCCGCGGAGACTCCCACGGCCGGGG
AAGAGTACCCCAGCCGCCCCACTGAGTTTGCCTTCTATCCGGGATATCCGGGAACCTACC 
AGCCTATGGCCAGTTACCTGGACGTGTCTGTGGTGCAGACTCTGGGTGCTCCTGGAGAAC 
CGCGACATGACTCCCTGTTGCCTGTGGACAGTTACCAGTCTTGGGCTCTCGCTNGTGGCT 
GGAACAGCCAGATGTGTTGCCAGGGAGAACAGAACCCACCAGGTCCCTTTTGGAAGGCAG 
CATTTG

Sequences from Table 4 not disclosed above 



AW006861 (IMAGE Clone ID: 2497262)

GCTGAGTTCTGAAGCTTCTGAGTTCTGCAGCCTCACCTCTGAGAAAACCTCTTTTCCACC
AATACCATGAAGCTCTGCGTGACTGTCCTGTCTCTCCTCATGCTAGTAGCTGCCTTCTGC 
TCTCTAGCGCTCTCAGCACCAATGGGCTCAGACCCTCCCACCGCCTGCTGCTTTTCTTAC 
ACCGCGAGGAAGCTTCCTCGCAACTTTGTGGTAGATTACTATGAGACCAGCAGCCTCTGC 
TCCCAGCCAGCTGTGGTATTCCAAACCAAAAGAAGCAAGCAAGTCTGTGCTGATCCCAGT 

GAATCCTGGGTCCAGGAGTACGTGTATGACCTGGAACTGAACTGAGCTGCTCAGAGACAG
GAAGTCTTCAGGGAAGGTCACCTGAGCCCGGATGCTTCTCCATGAGACACATCTCCTCCA 
TACTCAGGACTCCTCTCCGCAGTTCCTGTCCCTTCTCTTAATTTAATCTTTTTTATGTGC 
CGTGTTATTGTATTAGGTGTCATTTCCATTATTTATATTAGTTTAGCCAAAGGATAAGTG 
TCCCCTATGGGGATGGTCCACTGTCACTGTTTCTCTGCTGTTGCAAATACATGGATAACA 

CATTTGATTCTGTGTGTTTTCATAATAAAACTTTAAAATAAAATGCAAAAAAAAAAAAAA
AAAA 

X59770 

GCCACGTGCTGCTGGGTCTCAGTCCTCCACTTCCCGTGTCCTCTGGAAGTTGTCAGGAGC 
AATGTTGCGCTTGTACGTGTTGGTAATGGGAGTTTCTGCCTTCACCCTTCAGCCTGCGGC
ACACACAGGGGCTGCCAGAAGCTGCCGGTTTCGTGGGAGGCATTACAAGCGGGAGTTCAG
GCTGGAAGGGGAGCCTGTAGCCCTGAGGTGCCCCCAGGTGCCCTACTGGTTGTGGGCCTC 
TGTCAGCCCCCGCATCAACCTGACATGGCATAAAAATGACTCTGCTAGGACGGTCCCAGG 
AGAAGAAGAGACACGGATGTGGGCCCAGGACGGTGCTCTGTGGCTTCTGCCAGCCTTGCA 
GGAGGACTCTGGCACCTACGTCTGCACTACTAGAAATGCTTCTTACTGTGACAAAATGTC
CATTGAGCTCAGAGTTTTTGAGAATACAGATGCTTTCCTGCCGTTCATCTCATACCCGCA 
AATTTTAACCTTGTCAACCTCTGGGGTATTAGTATGCCCTGACCTGAGTGAATTCACCCG 
TGACAAAACTGACGTGAAGATTCAATGGTACAAGGATTCTCTTCTTTTGGATAAAGACAA 
TGAGAAATTTCTAAGTGTGAGGGGGACCACTCACTTACTCGTACACGATGTGGCCCTGGA 

AGATGCTGGCTATTACCGCTGTGTCCTGACATTTGCCCATGAAGGCCAGCAATACAACAT
CACTAGGAGTATTGAGCTACGCATCAAGAAAAAAAAAGAAGAGACCATTCCTGTGATCAT 
TTCCCCCCTCAAGACCATATCAGCTTCTCTGGGGTCAAGACTGACAATCCCGTGTAAGGT 
GTTTCTGGGAACCGGCACACCCTTAACCACCATGCTGTGGTGGACGGCCAATGACACCCA 
CATAGAGAGCGCCTACCCGGGAGGCCGCGTGACCGAGGGGCCACGCCAGGAATATTCAGA 

AAATAATGAGAACTACATTGAAGTGCCATTGATTTTTGATCCTGTCACAAGAGAGGATTT
GCACATGGATTTTAAATGTGTTGTCCATAATACCCTGAGTTTTCAGACACTACGCACCAC 
AGTCAAGGAAGCCTCCTCCACGTTCTCCTGGGGCATTGTGCTGGCCCCACTTTCACTGGC 
CTTCTTGGTTTTGGGGGGAATATGGATGCACAGACGGTGCAAACACAGAACTGGAAAAGC 
AGATGGTCTGACTGTGCTATGGCCTCATCATCAAGACTTTCAATCCTATCCCAAGTGAAA 

TAAATGGAATGAAATAATTCAAACACAAAAAAAAAAAAAAAAAAAAAA

AB000520 

GGATCCAAGCTATTGTCCTGCCCATGGCTTCCCATCTCAGGACGCTCTCTGGCCGCTATC 
ATCCCAGCAGTGGAGTTCAGCCCACTACTCTGAACCAGCCGCAGGTGGCTGCTATGGGAC 

TGAAGCCATGAATGGTGCCGGCCCTGGCCCCGCCGCAGCCGCCCCGGTCCCAGTCCCGGT
CCCGGTCCCGGACTGGCGGCAGTTCTGCGAGCTGCATGCGCAGGCGGCCGCCGTGGACTT 
TGCGCACAAGTTCTGCCGTTTCCTGCGGGACAACCCAGCTTACGACACGCCCGACGCCGG 
CGCCTCCTTCTCCCGCCACTTCGCCGCCAACTTCCTGGACGTCTTCGGCGAGGAGGTGCG 
CCGCGTGCTGGTGGCTGGGCCGACGACTCGGGGCGCGGCCGTGAGCGCAGAGGCCATGGA 

GCCGGAGCTCGCGGACACCTCTGCACTCAAGGCGGCGTCCTACGGCCACTCGCGGAGCTC
GGAGGACGTGTCCACGCACGCGGCCACCAAGGCCCGCGTTCGCAAGGGCTTCTCGCTGCG 
CAACATGAGCCTGTGCGTGGTGGACGGCGTGCGCGACATGTGGCACCGGCGCGCCTCGCC 
CGAGCCCGACGCGGCAGCTGCCCCGCGCACCGCCGAGCCCCGCGACAAGTGGACGCGGCG 
CCTGAGGCTGTCGCGGACGCTGGCTGCCAAGGTGGAGCTGGTGGACATTCAACGCGAGGG 

GGCGCTGCGCTTCATGGTGGCCGACGACGCGGCCGCGGGCTCCGGGGGCTCGGCTCAGTG
GCAGAAGTGCCGCCTGCTCCTGCGCAGGGCTGTGGCCGAGGAACGCTTCCGCCTGGAGTT 
CTTCGTGCCGCCCAAAGCCTCCAGGCCCAAGGTCAGCATCCCACTGTCAGCCATCATTGA 
GGTCCGCACCACCATGCCCCTGGAAATGCCAGAGAAGGATAACACATTCGTCCTCAAGGT 
AGAGAATGGAGCCGAATACATCTTGGAGACCATCGACTCTCTGCAGAAGCACTCGTGGGT 

AGCTGACATCCAGGGCTGCGTGGACCCCGGTGACAGTGAGGAAGACACCGAGCTCTCCTG
TACCCGAGGAGGCTGTCTGGCCAGCCGCGTGGCCTCCTGCAGCTGTGAGCTCCTGACTGA 
TGCAGTCGACCTGCCCCGCCCCCCAGAGACGACAGCCGTGGGTGCAGTGGTGACAGCCCC 
CCACAGCCGAGGTCGAGATGCCGTCAGAGAATCCCTGATCCACGTCCCGCTAGAGACCTT 
TCTGCAGACCCTGGAATCCCCGGGCGGCAGCGGCAGTGACAGCAATAACACAGGGGAACA
GGGTGCAGAGACGGATCCCGAGGCTGAACCCGAGCTGGAGCTATCCGACTACCCATGGTT 
CCACGGGACACTGTCCCGGGTCAAGGCTGCTCAACTGGTTCTGGCAGGGGGGCCCCGGAA 
CCACGGCCTCTTCGTGATCCGCCAAAGTGAGACTCGGCCTGGGGAGTACGTGCTGACCTT 
CAACTTCCAGGGCAAGGCCAAGCACCTGCGCCTGTCCCTGAACGGCCACGGCCAGTGTCA
CGTACAGCATCTGTGGTTCCAGTCTGTGCTTGACATGCTCCGCCACTTCCACACACACCC 
CATCCCACTGGAGTCAGGGGGCTCGGCCGACATCACCCTTCGCAGCTATGTGCGGGCCCA 
GGACCCCCCACCAGAGCCGGGCCCCACGCCCCCTGCCGCGCCCGCGTCCCCGGCCTGCTG 
GAGCGACTCGCCCGGCCAGCACTACTTCTCCAGCCTCGCCGCGGCCGCCTGCCCGCCTGC 

CTCGCCCTCCGACGCCGCCGGCGCCTCCTCGTCTTCCGCCTCGTCGTCCTCTGCCGCGTC
GGGGCCCGCCCCCCCGCGCCCCGTCGAGGGCCAGCTCAGCGCGCGGAGCCGCAGCAACAG 
CGCCGAGCGCCTGCTGGAGGCCGTGGCCGCCACCGCCGCCGAGGAGCCCCCGGAGGCCGC 
GCCCGGCCGCGCGCGCGCCGTGGAGAACCAGTACTCCTTCTACTAGCCCGCGGCGCCGCC 
CGGGTGGGACACGCCAAGCTCTTCAGTGAAGACACGATGTTATTAAAAGCCTGTTTTAGG 

GACTGCAAAA

AI820604 (IMAGE Clone ID: 1605108)

GATTCCAGCACGGGCTTCGCAGACTGCAGGACACAGAGGCACGCGTGCACATCATGTCTT 
CTAAGGAATTTGAACACTGTTGAGAAGACTGTGTACAAGAGAGATGTGCCATGTCAGCCT 

TGCAAGGGACAGCGTGAAAACTACCCATCTCCGGTCACCAAGTTGCAGGAGGCCAGGAGC
CAGGAGGGGAAACCGCTCAGTTTGCAAAACGTCGCTTCCACAAGCCTGATGGCTGAAACT 
GCTCACTGTACCCTGAAACCAGCTTTACCTACAGCTTCTGAGATAAACTGCTGCAACTCT 
GGGACCCACGATGCCTATCACAGTGGCTCATCAATGGAACCTGCCGGCTCCCAACCCTTC 
CTAGGGCCCATGAACTCTCTGAAAAGAGGAACAGAAATATTTCTCCTTTTTGTAAAATCT 

TTAACCTTCCCTTTGTTCTTCATGTACACGCTGAACTGCAATTCTTCTTCCCAAATAAAA
CATTAAATTTAAAAAA 

AI087057 (IMAGE Clone ID: 1671188)

GGCCCCGGAGGGAGAGTAACCCGGCCCATCCATCCGTCGCCCGGTTCTTGGGGAACTACT 
TTCAGGGGCTTCTTGCCGTCCCCTCATCAGCTCTGTGCGAACCCTCTGTCGGCAGCCATT
GAGGAGACCCTGCCCCCTGGACCCTGACCACATATAGATTGAGGCCGAGGAGTGGCTGCC 
CTGTCCCTTTTATGACAGCCCGCAGAAGCCCCGGGGTGAGGCATGGAGGAGGCAGGCGAC 
AGCTGACAGGGACCCTGTTGGCCTCCAGCATGTCCAGCCAGCCGGGCAGGATTTCTCTGC 
TTCTGGCTGGCAGCCAGGAACTGAGTATGACAATGTTGTACTAAAGAAAGGCCCAAAGTG 
ACAGAGGCAGCAGAGGGATGGTCCACCGCCCCTTGGCTTCTGCTGGTGACTCCTCCTGGC
CACTGCATCAGAAGAACCTCCTCTGCCCCTTCTGGAGCCCGAGGCCTGGCCTGTCTTCGT 
TGGGGCTGATAAATTGCCTCTCCCAGGGCCTGCTGGGTGAGTCACCATCCCAAAGCAGGA 
AGGGTGCCCTGGAGAGAACCACCCTCCTCCTACTCTTTTTCCACTTCCTCCTCTTTCTTT 
CCCCAGCTGAGGAGGAAGCTGGGGCATTTAGGGCAGAGGACAAAAGGATGTCAGCAATTG 
CTTGGGCTGCTTGGCTATGCAAGCCTCCTGCCTGCTGATGGCCACTTCAGGGACAGCCTG
GGCCCAGGCACCCAGGGGGATGGCGGCAGCTTCCTGCACCTTTCAGATTTCTTGGTGGCA 
TTAAAGCATTTTCAGAACAAAAAAAAAAAAAAAAAAAAAAAAAA 
AJ272267

GGCGGGCCTGGACGGCCGCGTGCTGTACTGGCCACGCGGCCGCGTCTGGGGTGGCTCCTC 
ATCCCTCAATGCCATGGTCTACGTCCGTGGGCACGCCGAGGACTACGAGCGCTGGCAGCG 
CCAGGGCGCCCGCGGCTGGGACTACGCGCACTGCCTGCCCTACTTCCGCAAGGCGCAGGG
CCACGAGCTGGGCGCCAGCCGGTACCGGGGCGCCGATGGCCCGCTGCGGGTGTCCCGGGG 
CAAGACCAACCACCCGCTGCACTGCGCATTCCTGGAGGCCACGCAGCAGGCCGGCTACCC 
GCTCACCGAGGACATGAATGGCTTCCAGCAGGAGGGCTTCGGCTGGATGGACATGACCAT 
CCATGAAGGCAAACGGTGGAGCGCGGCCTGTGCCTACCTGCACCCAGCACTGAGCCGCAC 

CAACCTCAAGGCCGAGGCCGAGACGCTTGTGAGCAGGGTGCTATTTGAGGGCACCCGTGC
AGTGGGCGTGGAGTATGTTAAGAATGGCCAGAGCCACAGGGCTTATGCCAGCAAGGAGGT 
GATTCTGAGTGGAGGTGCCATCAACTCTCCACAGCTGCTCATGCTCTCTGGCATCGGGAA 
TGCTGATGACCTCAAGAAACTGGGCATCCCTGTGGTGTGCCACCTACCTGGGGTTGGCCA 
GAACCTGCAAGACCACCTGGAGATCTACATTCAGCAGGCATGCACCCGCCCTATCACCCT 

CCATTCAGCACAGAAGCCCCTGCGGAAGGTCTGCATTGGTCTGGAGTGGCTCTGGAAATT
CACAGGGGAGGGAGCCACTGCCCATCTGGAAACAGGTGGGTTCATCCGCAGCCAGCCTGG 
GGTCCCCCACCCGGACATCCAGTTCCATTTCCTGCCATCCCAAGTGATTGACCACGGGCG 
GGTCCCCACCCAGCAGGAGGCTTACCAGGTACATGTGGGGCCCATGCGGGGCACGAGTGT 
GGGCTGGCTCAAACTGAGAAGTGCCAATCCCCAAGACCACCCTGTGATCCAGCCCAACTA 

CTTGTCAACAGAAACTGATATTGAGGATTTCCGTCTGTGTGTGAAGCTCACCAGAGAAAT
TTTTGCACAGGAAGCCCTGGCTCCGTTCCGAGGGAAAGAGCTCCAGCCAGGAAGCCACAT 
TCAGTCAGATAAAGAGATAGATGCCTTTGTGCGGGCAAAAGCCGACAGCGCCTACCACCC 
CTCGTGCACCTGTAAGATGGGCCAGCCCTCCGATCCCACTGCCGTGGTGGATCCGCAGAC 
AAGGGTCCTCGGGGTGGAAAACCTCAGGGTCGTCGATGCCTCCATCATGCCTAGCATGGT 

CAGCGGCAACCTGAACGCCCCCACAATCATGATCGCAGAGAAGGCAGCTGACATTATCAA
GGGGCAGCCTGCACTCTGGGACAAAGATGTCCCTGTCTACAAGCCCAGGACGCTGGCCAC 
CCAGCGCTAAGACAGTTGCTGCTGGAGGATGACCAGGGAAGCCCCCTGATAAGCCAAGAG 
GGCCAGCACAGCCCTTGCTCCCAGGCTCCTGCCTGAAACTATCTAGCACACTAGGACCCA 
GGTGGTACCCTACTCAGTGGCTGAGAATTGGATAAAGTCTTKGGGAAATGAGACAAGTAC 

TGGGCAGTGAATCCAGCTCCTTTTCCCCAGCCTTTCCCTGTGGGCCATTTGGGGAAGGCC
AGCATTYCAGCCTGAGATGTTCCTCCCTGCCTCCTGGGGGGGCARAAGGGVTAGGWTGGT 
TAACTCCTGCCGCATCCTTCCCTGCCTCCTGGAGGGACAGAAGGGGAGGATGGTTAACTC 
CTGCCGCATCCTTTTTCTTGTGTTCACGTGGCATTCTCTAACCCAGGGCAGTGGTTCCTT 
CCCAGGCCATGCACAGAGGCTGGGTGCCTGCCAGACCCACGGAGGGTTCGCGAAGGAAGG 

GGCATCCTCCTTCTTGAGCTGCAAGCTTTAGCTGAGGCAGTAAGTCACACAGTAGTTAGT
TCAGCCTGGGCTGGCACATAAGTCCCCAGTGTCCCTGTTGAGAGGGGAAAGTTGCCTGCT 
GGTTGAAAAACTGGCTTTTCCTTTCTCGCTGCCTAATTTCACTCTCAGAGTGAGGCAGGT 
AACTGGGGCTCCACTGGGTCACTCTGAGAGGGTTGTGGCTCTGGTTCTTATTAAACCAGG 
GCCAGGTGCAGGGCTCACACCTGTAATCCCAGCACTTTGGGAAGGTCACTTGAGCTCAGG 

AGTTCAAGACCAGCCTGGGCAACATAGTGAGACCTTGTCTCTGGAAAACAATTAGCTGGG
CATGGTGGTACACACCTGTAGTCCCAGCTACTTGGGAGGCTGAGGCGGGAGGATGGCTTT 
AGCCCAGGAGGTTGAGGCTCCTGTGAACCCTGATGGCACCACTGCACTCCAGCCTGGGTG 
ACAGGGTGAGACCCTGTCTCAAAAAAAAA 
N30081 (IMAGE Clone ID: 258695)

CCGCCGTTGNCAAAGGGCCCAGAATATGGGCCATGGACNATCTCCATGCCTGGGGAAATT 
CCCTCGGGTCTTTTGGNTAACCNCCTTATAGAAAGGTAATGNCATGGAGTCTCTACAGGG 
NGCACAAGGTGGACTAATTGATACGAAGAGCCCTGTAAATATGTGGGCAGCGGCAGATTT
TGACCATTTGGACCGAACTGTATTTGACACAGCGCAATATCTGGAACTGGTTGGTCAAAA
ACCTGCTTGTCTTGTTAAATTTCCTCTGTCCAAGGACATGGAATCTCTCTCTAATTTTAC 
TTCAAATTTCCCTTTCCTTCATTTCTCTAAAAACGTTAAATAAGAAAGAAGATTGTAAAG 
CCAGCATTTGAAGCCTAAGTATTGAAAGTCTTTGACAATTTCTGAAATCAGACTTGACAT 
CTTTCCCCCGCCTTGCAAATTTCTTGAAGAAATAAGAAGCTACATGTAAGCATCATCATG
TTTATTAAATTACAATGAGAACTCTCACTCAATCTTGACCAGAGCAGACTCTTAACTTGG 
AAGCAGAGTCCCTCTAAAGGTAACTCTTGTGGTCACTCAATATTGTATTGGCATTTGCAT 
ATTAAATAGACATTTCAGTAGCATTT 

AI700363 (IMAGE Clone ID: 2327403)

TGGCCCGCGGTCGCGGTGGGATCCTAGCCCTGTCTCCTCTCCTGGGAAGGAGTGAGGGTG 
GGACGTGACTTAGACACCTACAAATCTATTTACCAAAGAGGAGCCCGGGACTGAGGGAAA 
AGGCCAAAGAGTGTGAGTGCATGCGGACTGGGGGTTCAGGGGAAGAGGACGAGGAGGAGG
AAGATGAGGTCGATTTCCTGATTTAAAAAATCGTCCAAGCCCCGTGGTCCAGCTTAAGGT 

CCTCGGTTACATGCGCCGCTCAGAGCAGGTCACTTTCTGCCTTCCACGTCCTCCTTCAAG
GAAGCCCCATGTGGGTAGCTTTCAATATCGCAGGTTCTTACTCCTCTGCCTCTATAAGCT 
CAAACCCACCAACGATCGGGCAAGTAAACCCCCTCCCTCGCCGACTTCGGAACTGGCGAG 
AGTTCAGCGCAGATGGGCCTGTGGGGAGGGGGCAAGATAGATGAGGGGGAGCGGCATGGT 
GCGGGGTGACCCCTTGGAGAGAGGAAAAAGGCCACAAGAGGGGCTGCCACCGCCACTAAC 

GGAGATGGCCCTGGTAGAGACCTTTGGGGGTCTGGAACCTCTGGACTCCCCATGCTCTAA
CTCCCACACTCTGCTATCAGAAACTTAAACTTGAGGATTTTCTCTGTTTTTCACTCGCAA 
TAAATTCAGAGCAAACAAAAAAAAAAAAAAA 

AL117406

CAATAGGCCGGCTTTTGAACTGCTTCGCAGGGGACTTGGAACAGCTGGACCAGCTCTTGC
CCATCTTTTCAGAGCAGTTCCTGGTCCTGTCCTTAATGGTGATCGCCGTCCTGTTGATTG 
TCAGTGTGCTGTCTCCATATATCCTGTTAATGGGAGCCATAATCATGGTTATTTGCTTCA 
TTTATTATATGATGTTCAAGAAGGCCATCGGTGTGTTCAAGAGACTGGAGAACTATAGCC 
GGTCTCCTTTATTCTCCCACATCCTCAATTCTCTGCAAGGCCTGAGCTCCATCCATGTCT 

ATGGAAAAACTGAAGACTTCATCAGCCAGTTTAAGAGGCTGACTGATGCGCAGAATAACT
ACCTGCTGTTGTTTCTATCTTCCACACGATGGATGGCATTGAGGCTGGAGATCATGACCA 
ACCTTGTGACCTTGGCTGTTGCCCTGTTCGTGGCTTTTGGCATTTCCTCCACCCCCTACT 
CCTTTAAAGTCATGGCTGTCAACATCGTGCTGCAGCTGGCGTCCAGCTTCCAGGCCACTG 
CCCGGATTGGCTTGGAGACAGAGGCACAGTTCACGGCTGTAGAGAGGATACTGCAGTACA 

TGAAGATGTGTGTCTCGGAAGCTCCTTTACACATGGAAGGCACAAGTTGTCCCCAGGGGT
GGCCACAGCATGGGGAAATCATATTTCAGGATTATCACATGAAATACAGAGACAACACAC 
CCACCGTGCTTCACGGCATCAACCTGACCATCCGCGGCCACGAAGTGGTGGGCATCGTGG
GAAGGACGGGCTCTGTAGGTTTTTACTGAGCACCTACTATGTGCCTGGGAACCGAAAGGG 
AAGTCCTCCTTGGGCATGGCTCTCTTCCGCCTGGTGGAGCCCATGGCAGGCCGGATTCTC 
ATTGACGGCGTGGACATTTGCAGCATCGGCCTGGAGGACTTGCGGTCCAAGCTCTCAGTG 
ATCCCTCAAGATCCAGTGCTGCTCTCAGGAACCATCAGATTCAACCTAGATCCCTTTGAC
CGTCACACTGACCAGCAGATCTGGGATGCCTTGGAGAGGACATTCCTGACCAAGGCCATC 
TCAAAGTTCCCCAAAAAGCTGCATACAGATGTGGTGGAAAACGGTGGAAACTTCTCTGTG 
GGGGAGAGGCAGCTGCTCTGCATTGCCAGGGCTGTGCTTCGCAACTCCAAGATCATCCTT 
ATCGATGAAGCCACAGCCTCCATTGACATGGAGACAGACACCCTGATCCAGCGCACAATC 

CGTGAAGCCTTCCAGGGCTGCACCGTGCTCGTCATTGCCCACCGTGTCACCACTGTGCTG
AACTGTGACCACATCCTGGTTATGGGCAATGGGAAGGTGGTAGAATTTGATCGGCCGGAG 
GTACTGCGGAAGAAGCCTGGGTCATTGTTCGCAGCCCTCATGGCCACAGCCACTTCTTCA 
CTGAGATAAGGAGATGTGGAGACTTCATGGAGGCTGGCAGCTGAGCTCAGAGGTTCACAC 
AGGTGCAGCTTCGAGGCCCACAGTCTGCGACCTTCTTGTTTGGAGATGAGAACTTCTCCT 

GGAAGCAGGGGTAAATGTAGGGGGGGTGGGGATTGCTGGATGGAAACCCTGGAATAGGCT
ACTTGATGGCTCTCAAGACCTTAGAACCCCAGAACCATCTAAGACATGGGATTCAGTGAT 
CATGTGGTTCTCCTTTTAACTTACATGCTGAATAATTTTATAATAAGGTAAAAGCTTATA 
GTTTTCTGATCTGTGTTAGAAGTGTTGCAAATGCTGTACTGACTTTGTAAAATATAAAAC 
TAAGGAAAACTCAAAAAAAAAAAA

M92432 

CCCACAGGGGGACCGGCCCTGTGACCCCTCACCGGGGCCGTGGGCCCGAGCCCCGGACTT 
CCCTAAGCCGGCAATGACCGCCTGCGCCCGCCGAGCGGGTGGGCTTCCGGACCCCGGGCT 
CTGCGGTCCCGCGTGGTGGGCTCCGTCCCTGCCCCGCCTCCCCCGGGCCCTGCCCCGGCT 

CCCGCTCCTGCTGCTCCTGCTTCTGCTGCAGCCCCCCGCCCTCTCCGCCGTGTTCACGGT
GGGGGTCCTGGGCCCCTGGGCTTGCGACCCCATCTTCTCTCGGGCTCGCCCGGACCTGGC 
CGCCCGCCTGGCCGCCGCCCGCCTGAACCGCGACCCCGGCCTGGCAGGCGGTCCCCGCTT 
CGAGGTAGCGCTGCTGCCCGAGCCTTGCCGGACGCCGGGCTCGCTGGGGGCCGTGTCCTC 
CGCGCTGGCCCGCGTGTCGGGCCTCGTGGGTCCGGTGAACCCTGCGGCCTGCCGGCCAGC 

CGAGCTGCTCGCCGAAGAAGCCGGGATCGCGCTGGTGCCCTGGGGCTGCCCCTGGACGCA
GGCGGAGGGCACCACGGCCCCTGCCGTGACCCCCGCCGCGGATGCCCTCTACGCCCTGCT 
TCGCGCATTCGGCTGGGCGCGCGTGGCCCTGGTCACCGCCCCCCAGGACCTGTGGGTGGA 
GGCGGGACGCTCACTGTCCACGGCACTCAGGGCCCGGGGGCTGCCTGTCGCCTCCGTGAC 
TTCCATGGAGCCCTTGGACCTGTCTGGAGCCCGGGAGGCCCTGAGGAAGGTTCGGGACGG 

GCCCAGGGTCACAGCAGTGATCATGGTGATGCACTCGGTGCTGCTGGGTGGCGAGGAGCA
GCGCTACCTCCTGGAGGCCGCAGAGGAGCTGGGCCTGACCGATGGCTCCCTGGTCTTCCT 
GCCCTTCGACACGATCCACTACGCCTTGTCCCCAGGCCCGGAGGCCTTGGCCGCACTCGC 
CAACAGCTCCCAGCTTCGCAGGGCCCACGATGCCGTGCTCACCCTCACGCGCCACTGTCC
CTCTGAAGGCAGCGTGCTGGACAGCCTGCGCAGGGCTCAAGAGCGCCGCGAGCTGCCCTC 

TGACCTCAATCTGCAGCAGGTCTCCCCACTCTTTGGCACCATCTATGACGCGGTCTTCTT
GCTGGCAAGGGGCGTGGCAGAAGCGCGGGCTGCCGCAGGTGGCAGATGGGTGTCCGGAGC 
AGCTGTGGCCCGCCACATCCGGGATGCGCAGGTCCCTGGCTTCTGCGGGGACCTAGGAGG 
AGACGAGGAGCCCCCATTCGTGCTGCTAGACACGGACGCGGCGGGAGACCGGCTTTTTGC 
CACATACATGCTGGATCCTGCCCGGGGCTCCTTCCTCTCCGCCGGTACCCGGATGCACTT
CCCGCGTGGGGGATCAGCACCCGGACCTGACCCCTCGTGCTGGTTCGATCCAAACAACAT 
CTGCGGTGGAGGACTGGAGCCGGGCCTCGTCTTTCTTGGCTTCCTCCTGGTGGTTGGGAT 
GGGGCTGGCTGGGGCCTTCCTGGCCCATTATGTGAGGCACCGGCTACTTCACATGCAAAT 
GGTCTCCGGCCCCAACAAGATCATCCTGACCGTGGACGACATCACCTTTCTCCACCCACA
TGGGGGCACCTCTCGAAAGGTGGCCCAGGGGAGTCGATCAAGTCTGGGTGCCCGCAGCAT 
GTCAGACATTCGCAGCGGCCCCAGCCAACACTTGGACAGCCCCAACATTGGTGTCTATGA 
GGGAGACAGGGTTTGGCTGAAGAAATTCCCAGGGGATCAGCACATAGCTATCCGCCCAGC 
AACCAAGACGGCCTTCTCCAAGCTCCAGGAGCTCCGGCATGAGAACGTGGCCCTCTACCT 

GGGGCTTTTCCTGGCTCGGGGAGCAGAAGGCCCTGCGGCCCTCTGGGAGGGCAACCTGGC
TGTGGTCTCAGAGCACTGCACGCGGGGCTCTCTTCAGGACCTCCTCGCTCAGAGAGAAAT 
AAAGCTGGACTGGATGTTCAAGTCCTCCCTCCTGCTGGACCTTATCAAGGGAATAAGGTA 
TCTGCACCATCGAGGCGTGGCTCATGGGCGGCTGAAGTCACGGAACTGCATAGTGGATGG 
CAGATTCGTACTCAAGATCACTGACCACGGCCACGGGAGACTGCTGGAAGCACAGAAGGT 

GCTACCGGAGCCTCCCAGAGCGGAGGACCAGCTGTGGACAGCCCCGGAGCTGCTTAGGGA
CCCAGCCCTGGAGCGCCGGGGAACGCTGGCCGGCGACGTCTTTAGCTTGGCCATCATCAT 
GCAAGAAGTAGTGTGCCGCAGTGCCCCTTATGCCATGCTGGAGCTCACTCCCGAGGAAGT 
GGTGCAGAGGGTGCGGAGCCCCCCTCCACTGTGTCGGCCCTTGGTGTCCATGGACCAGGC 
ACCTGTCGAGTGTATCCTCCTGATGAAGCAGTGCTGGGCAGAGCAGCCGGAACTTCGGCC 

CTCCATGGACCACACCTTCGACCTGTTCAAGAACATCAACAAGGGCCGGAAGACGAACAT
CATTGACTCGATGCTTCGGATGCTGGAGCAGTACTCTAGTAACCTGGAGGATCTGATCCG 
GGAGCGCACGGAGGAGCTGGAGCTGGAAAAGCAGAAGACAGACCGGCTGCTTACACAGAT 
GCTGCCTCCGTCTGTGGCTGAGGCCTTGAAGACGGGGACACCAGTGGAGCCCGAGTACTT 
TGAGCAAGTGACACTGTACTTTAGTGACATTGTGGGCTTCACCACCATCTCTGCCATGAG 

TGAGCCCATTGAGGTTGTGGACCTGCTCAACGATCTCTACACACTCTTTGATGCCATCAT
TGGTTCCCACGATGTCTACAAGGTGGAGACAATAGGGGACGCCTATATGGTGGCCTCGGG 
GCTGCCCCAGCGGAATGGGCAGCGACACGCGGCAGAGATCGCCAACATGTCACTGGACAT 
CCTCAGTGCCGTGGGCACTTTCCGCATGCGCCATATGCCTGAGGTTCCCGTGCGCATCCG 
CATAGGCCTGCACTCGGGTCCATGCGTGGCAGGCGTGGTGGGCCTCACCATGCCGCGGTA 

CTGCCTGTTTGGGGACACGGTCAACACCGCCTCGCGCATGGAGTCCACCGGGCTGCCTTA
CCGCATCCACGTGAACTTGAGCACTGTGGGGATTCTCCGTGCTCTGGACTCGGGCTACCA 
GGTGGAGCTGCGAGGCCGCACGGAGCTGAAGGGCAAGGGCGCCGAGGACACTTTCTGGCT 
AGTGGGCAGACGCGGCTTCAACAAGCCCATCCCCAAACCGCCTGACCTGCAACCGGGGTC 
CAGCAACCACGGCATCAGCCTGCAGGAGATCCCACCCGAGCGGCGACGGAAGCTGGAGAA 

GGCGCGGCCGGGCCAGTTCTCTTGAGAAGTGAGGCCCGGCCCCGGACAGGGTCTGGGCCC
TGCTCCCTGTCCCATCTGCAGTGGACCCCAGGCACCCCCCTTTGAGGAGGTGGGGTGAAC 
TGCTCCTTGGCAGGGATTTGTGACACTGCATTGCTGGGCTGTGTTCCTCGGGCTCTTCTG 
GACCTTGCACCGTGGATACCAGGCCATGTGCCATGGTATTTGGGTCCTGGGAGGGTGGGT 
GAAATAAAGGCATACTGTCTT 

AL050227 

CTTTCACAGAAAGAAAGTAACAGGCATAATTCCTGTTGATGAGGCTGGGATTGTTTTTAA 
GAGGAGAGATAATAACTTCATATTTTTAAAGTGCCAGTAGCCTAATATGTGAAACAGATC 
AGAATCTGTTGTGTAGTAAGTCTGCTTTGTTGAAGAATTTATTATGGGAGTAAAGATAAG
AAGGAAAGAGATCACCATCAGAAACAAGTCAGCCTTTTCATGCTTTTTTGAGCATTTTTG 
GAGATGATTCCACTTCTCAAGTTATTATCATTTGTGCATCTCTTCAATGCTATTGTTAAA 
TGCTTTAGAATTAGAATATTTTGATCCTTTAATTAAAGTAAGCCAAACGTCTAGGCAAAA 
ACAGCCAATCATTAAACTTTAATAGTAATTCAAATATAGATTTCTCATACAGTTTTCCAT
GTCTGTAGAAATCAAAGTTGTAATGTTAAGCAGAGGGAAATGCGTGTGATTTACTAATAC 
ACTTCAACGTTCTACTTTTGAAAGGATACTCATGTGGGTGGGGCAGAGAACATAGAAAAA 
GATATGATGGAAAACCTGTCCATTTTCTACCTGTTAACCTTCATCATTTTGTGCAGGCCC 
TGGAAGCAAAGAGAGGAAGGGACCGACTGCATTTATCTTTGAACACTTGAGCATCAGTAG 

TACTACTGAGTGGCCAGGGGTCTTGTCTGTCAAAGCAAATGATAAGTTCACTCAGGCCAT
TATTGACTGCTGAACTCTCTTCCTTCCCAACTCTTCCTTGAAAGAGAAAAAAATACTTTG 
CCTTCTTGCTCTCCTTATCAAATGTTTTTGTACAAATAGTGTAAGCCTGTTTAAGCAAAC 
CAATTAAAATAGGCACTGATTATTTTGATCTGTTTGTAACAAATGAATGTAAGTACTATT
TACATGGTGTGCCTAGGAGGAGCTGAAATCATTGGCACTTTAATCCATATTGTAAAGATC 

AGTATCAAAAGCATAGTGTTCTTCACCTCTCCTCCTCAGCATCCATCTCTATATACTTGA
TTAAATGGAAAAGTCTCTTTTATCACCTCTATGTAAAGTTTTATGGGTAGTTATCGTCAG 
TGTATTTAAATATATCTTCTAGTATGTTTTAAAGGCTGGTCTTCAATACTGTGGAGACAA 
AAAATAAAAGAGCGTATGAAAAGTACGTTAGACTTTTGCTGGCATTCAAGTCATGGCTAG 
TCTGTGTATTTAATAAATGTGTGTTATTTATGTCGTGTTTGTCAATGGAAAATAAAGTTG 

AATATTCTGAAAAAAAAAAAAAAA

AW613732 (IMAGE Clone ID: 2953502) 

CCTANAAGTNCCATTTTGGCAAGGATAAACTCCCATGACAANCTCCCANTACTGCATGTG 
AATGAATAAGAAACAAGAANTGACCACACCAAAGCCTCCCTGGCTGGTGTTACANGGGAT 

CAGGTCCACAGTGGTGCAGATTCAACCACCACCCAGGGAGTGCTTGCAGACTCTGCATAG
ATGTTGCTGCATGCGTCCCATGTGCCTGTCAGAATGGCAGTGTTTAATTCTCTTGAAAGA 
AAGTTATTTGCTCACTATCCCCAGCCTCAAGGAGCCAAGGAAGAGTCATTCACATGGAAG 
GTCCGGGACTGGTCAGCCACTCTGACTTTTCTACCACATTAAATTCTCCATTACATCTCA 
CTATTGGTAATGGCTTAAGTGTAAAGAGCCATGATGTGTATATTAAGCTATGTGCCACAT 

ATTTATTTTTAGACTCTCCACAGCATTCATGTCAATATGGGATTAATGCCTAAACTTTGT
AAATATTGTACAGTTTGTAAATCAATGAATAAAGGTTTTGAGTGTAAAAAAAAAAAAAAA 
AAAAAAA 

BC007783 (IMAGE Clone ID: 4308472) 

GGCACGAGGGCAAAGAGTAGTCAGTCCCTTCTTGGCTCTGCTGACACTCGAGCCCACATT
CCATCACCTGCTCCCAATCATGCAGGTCTCCACTGCTGCCCTTGCCGTCCTCCTCTGCAC 
CATGGCTCTCTGCAACCAGGTCCTCTCTGCACCACTTGCTGCTGACACGCCGACCGCCTG 
CTGCTTCAGCTACACCTCCCGGCAGATTCCACAGAATTTCATAGCTGACTACTTTGAGAC 
GAGCAGCCAGTGCTCCAAGCCCAGTGTCATCTTCCTAACCAAGAGAGGCCGGCAGGTCTG 

TGCTGACCCCAGTGAGGAGTGGGTCCAGAAATACGTCAGTGACCTGGAGCCGAGTGCCTG
AGGGGTCCAGAAGCTTCGAGGCCCAGCGACCTCAGTGGGCCCAGTGGGGAGGAGCAGGAG 
CCTGAGCCTTGGGAACATGCGTGTGACCTCCACAGCTACCTCTTCTATGGACTGGTTATT 
GCCAAACAGCCACACTGTGGGACTCTTCTTAACTTAAATTTTAATTTATTTATACTATTT
AGTTTTTATAATTTATTTTTGATTTCACAGTGTGTTTGTGATTGTTTGCTCTGAGAGTTC 
CCCCTGTCCCCTCCACCTTCCCTCACAGTGTGTCTGGTGACAACCGAGTGGCTGTCATCG 
GCCTGTGTAGGCAGTCATGGCACCAAAGCCACCAGACTGACAAATGTGTATCAGATGCTT 
TTGTTCAGGGCTGTGATCGGCCTGGGGAAATAATAAAGATGTTCTTTTAAACGGTAAAAA
AAAA 

X81896 

AGAAAACTATTTTCTAAATATTAACACTGAAAATGTTTTGTTAGCTTTTCCTTCTTTCTC 
TCCAGAAGAAACATGGATAGATGATAGCTGTTTCATTGTTTGTTTTTGTCAAGCATATTC 
ACTTTCCTCCTTGTCCTCTGATTCTGAGCAAAGGGCCTCAGACTCTGAACTTCCCTCAAG 
TGCCGTTGTTATGTGAACTCTTCCATTCAGATTCCAGAGAGGTTCTCATGCTCCCCCCCC 
CTCCTTATTTGTAGCAATCGTAGCAACTAATTCCACTAAGTACAAGGGAGTTTTTTACAC 
TCCTCCATTTTTATAGCATCTGCATTTTTTTTTTTTGTTAGGTACATGTATACACCTGCC 
TGAGTATAAATACTCTCTCTACCTAATAATAACATCAACCAACATCTTTTCCAAATTAGG 
GCCACAGAACAGCAACATTTGTCTGACAGTAGTATAAAGAATAATGATAGCTCTATCCTT 
AAGAAGTATTTCCTTTCCTTTTTATATAGTCCCGTTAGGGTTTAAAACCATATTGATCAA 
CTAGAAAGAAAAATATGAAAAGAGAAAAATATTTTAATTTAAAAATTGTAATACATTGAT 
TTATAAAATGCCTTCTCTGATACTTTTGAAACAGATGTGAAAAACAGAAAAAGAAAAAAT 
TGTCTGAAATGTTTATTTTGCAAAACAGTGCAATAGAATCTAGTTATGCCTTCATCACTG 
TTGACAGTAAATACTGACAGCCCCTTGCAGTGTGTTAGTTTTAGATCACTCTGTTTTAGT 
TGAGAGAAATGTTTTATATCATGGTTTTTATATGAATACAAATTATTTCTCAAAGATTTA 
TAGCACACACTATTCTCAGGAATTCTGTATTACATGAATGCTGCTTATATATTTTCATAT 
TCTAACTTGTCTTTTCAAGCAAATAACTAATATATATGTGCATGCAGTCTGCCTTGACAA 
GTTGTTCCAAGCTGAAGAGCTTTCACTGTACAATGTGTGGAAAATCACCATAGATCATGG 
CTGAAATAGTTTGTAATTGTCTGAGTCTGTGCACGTACTTTTAGATAAAATGCTGCTGAG 
TGACTGCATGATGAGATACAACTTCTGAATGCTGCACATTCTTCCAAAATGATCCTTAGC 
ACAATCTATTGTATGATGGAATGAATAGAAAACTTTTTCACTCAATAAATTATTATTTGA 
TATGGTAAAAAAAAAA 

BC004960 (IMAGE Clone ID: 3632495) 

CCCAAGGTTGTTATATCTTCATGTCCTCATTTCTTAGGGAGGTACCTTCAGAACCAATAG 
TGACCCCTAACTTCTCTGGTGGTCGGTTCCATGAAAGGCAAAGGAGTGTGAGAGAGGAGT 
GGATGGTCAACCTCCCACTGCCATGGTAACATGGGTGCTGGCTGATGGGAGCAGAAAATA 
ATTTAGTGAAAGTCTGTGGGGGCAGTCACAAGATGTCTGAGAAAACTGGCGAGCCAGCTG
CTGAAAACAGGGACAAGGAAGCCTCCGTGGCTGGAGCCCAAATCACACTGCAGACCCAGA 
CACCGTGACCACCACCATGGACTCCAGAGAGAGCAGCTTATAGTACTCAATCAGCTGCCA 
CTACCACCATCCAGAACACCAGATGTTGTAGCCATGGCTGCAGCAGGAATGGATGTCCCA 
CTGTCCCTGCTCCTCGGTGTGACTTGCTCCCAAGTTCAGGGCAGGTCCATCTGATTGGCT 
GAGTCTGGAATGTCTGCCTGTGCCTCAGCTGTGAGGGAGGCAGGGAAAGTAAGCCTTTTC
AGCTTCTGTCGTGGGAGGTGGGCTCTGCCTCCTACCAAGAATCAAAGGGTGGAGGATCTT 
CAAACACAGGAAAAGAACCCGGATCCTGGCACCCCCAAATTTTCAGAGTCCATTTCAGAG 
CATAAGAAATTGAGGGTCCAAGATCATTCATGTAAGAAGTTTAGAGGGGGAAGAAAAGAA
TGATAAACGAAAAGAACAGCAATAGTAAAGGATCTTTTCTTTGTTTCAGTAAGATGAAGA 
GGCCTGAGCAGTTTCGTGGAGGGGAAGAAACAGGAAAACCTCTTCAAAAGACAAAAAGCT 
GGCACTGCATTCTCTCTCTGTAGCAGGACAGAACTGTCTAAAGACAAGACCCCTTTGGCC 
AAAATAAAGGAACCTGAAACATTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAACCTCGGG 

AK027250 

AAATTTAATTAATTATAAACTCAGTCTCTTGGTTGCACCAGCCACATTTCAGATGCTCAA 

TAGCCACATGTGGCTAGTGGGTACCATATTGGACAGGGCAGCTATAGAATATTTCCATCA
TTGCAGAAAGTTCTATTGGATAGTACCATAATCTTTTTATAGTAACTTGGAAATACTATT 
TGATATTAGATGTTAGACCACAAAAAGAAGAAAAATGTTAGGACTATTTCAGATATAAAA 
AGGAACTGAATTGTGACATAATTAGCATCTTACATTCCATACAGTTGAATACCTTATGCT 
GTGACAACCATAGTTAATCATTTCAGTGCTGTTCAACATACATACCTATCAGCAGTGTGT 

TTAGACCAGGGGTCTGCAAACTTTCTGTGAATGGACAAAGAGTAAATACTTTAGTAAATG
TCTTAGGCTTTGTGGCCTACATGATCTTTGTTGCAAGTACTCAACTCTGCCATTATAGAG 
TTAAAGCAGCCATACACAATATATAAACAAAATGGGCATAGTTGTATTTCAGTAAAACTT 
TATTTACAAAGACAGGCGGTAGGCCAGATTTGGCTTGCATGCTGTAGAGCTGTGGTCTAA 
ATTTTATTCATAGACTTTCTTTGCAAATACAGTGTGAGTATTGTTCCATTTACAGTATTA 

TTATTTTTTAGATACCTGGTTTTTAGATTCTTGCCTGGTAACTTTTTACTGAAAATACAA
GAATTTCGTACTGCATTTGCATCTCCGAGATTAGGGAGCACCTGTCAGGATATGTTGTTC 
TATCAGGGTTACTTCTGTTGACTACCTCTTAGATTTTGATACAGTTATATTGTTGAGTTT 
CATTTTCATATATTCTTGTAGTGTCTGCTTGCCTGTGACTTCTGGTAAAATAAAATAAGC 
CTTTGAAAATATTTTAGCATGGTATTTAACATTTTCTAAATATTATGGCATTTTGACATA 

TTTTAGTCAGCGAAGACATCTGCCCCTTTGGTGTTTCTACTTGCTTATGATTGAGATTTT
ACAAGCCCTTCAAACTCCGTTTTAAAGGAATTTATTGTAAAACATTAACTTTAATAAATT 
AGTGTTTTCACAGATCAGATCATTATACTTGGAACTTCTAAATCATGCAATTTCTGAATA 
AGGACATAAGGCTAGATTCATTTTTCTTAATAGAGAAAAAGGAAATTTCTGATTTATCAC 
TTTTCTAGTTGATAAGTAGGATTCAAAACGTTTGATATGTAAGTATTTATATAAGACTAA 

TGTAATTTAAAGTTCTGTATTATTGTGATTAATCATACAGAAATTCAGGAACTGATCAGA
AGTGAGATTCTTTTCCACATCTGGTTAATGTAGTGAGTTGACACCCTGTGGGTGGTAAAG 
CATTATAAACATTTCATCTTGAACCATGATTTATACACATCTGTGTTATAAGGGAGGCTT 
GAGTACATATACCAATGAAGAGATATTCAGCATTTGTCTATTTGATAAGGAATTAAATGT 
CCTAGTGATTATAAAGTAAAACCACAGACCAATTTGCAAATGATCTTCAATGTTAAGCAC 

TTGCTCTAAGATTAAAATTCCTTTTCTTTTTAAGGTTAAGGGTGTGTACGTATGGCAGTG
ATGTCTATGTTGAGATTAACTTATGTATTGAGGAAAATTTGAAGTTTATTTTTTCGATGA 
ATAAGGCTGTCAAATGATTTAGTATAGATTAATGACATCTTTTTTAGAAATATTAAAGTG 
AGTATTCCTCATTATGTCATCATTTCTGATAATTAGAGTGCTAATTTGAATGTTAGATAA 
TGTTTCCACATCTATACCTATTTCTTTCTAGGGCACTTCTGACCCTGGGGCTTGGGGATG 

GCCTTTAGGCCACAGTAGTGTCTGTGTTAAGTTCACTAAATGTGTATTTAATGAGAAACA
TTCCTATGTAAAAATGTGTGTATGTGAACGTATGCATACATTTTTATTGTGCACCTGTAC 
ATTGTGAAGAAGTAGTTTGGAAATTTGTAAAGCACAAACCATAAAAGAGTGTGGAGTTAT 
TAAATGATGTAGCACAAATGTAATGTTTAGCTTATAAAAGGTCCTTTCTATTTTCTATGG 
CAAAGACTTTGACACTTGAAAAATAAAACCAATATTTGATTTATTTTTGTAAGTATTTAG
GATATTATTTTAAATAAATGATTGTCCATTATCAATAAAAAAAAAAAAAAAAAA 

Sequences from Table 5 not disclosed above 

NM_014298

GTCCTGAGCAGCCAACACACCAGCCCAGACAGCTGCAAGTCACCATGGACGCTGAAGGCC 
TGGCGCTGCTGCTGCCGCCCGTCACCCTGGCAGCCCTGGTGGACAGCTGGCTCCGAGAGG 
ACTGCCCAGGGCTCAACTACGCAGCCTTGGTCAGCGGGGCAGGCCCCTCGCAGGCGGCGC 

TGTGGGCCAAATCCCCTGGGGTACTGGCAGGGCAGCCTTTCTTCGATGCCATATTTACCC
AACTCAACTGCCAAGTCTCCTGGTTCCTCCCCGAGGGATCGAAGCTGGTGCCGGTGGCCA 
GAGTGGCCGAGGTCCGGGGCCCTGCCCACTGCCTGCTGCTGGGGGAACGGGTGGCCCTCA 
ACACGCTGGCCCGCTGCAGTGGCATTGCCAGTGCTGCCGCCGCTGCAGTGGAGGCCGCCA 
GGGGGGCCGGCTGGACTGGGCACGTGGCAGGCACGAGGAAGACCACGCCAGGCTTCCGGC 

TGGTGGAGAAGTATGGGCTCCTGGTGGGCGGGGCCGCCTCGCACCGCTACGACCTGGGAG
GGCTGGTGATGTTGAAGGATAACCATGTGGTGCCCCCCGGTGGCGTGGAGAAGGCGGTGC 
GGGCGGCCAGACAGGCGGCTGACTTCGCTCTGAAGGTGGAAGTGGAATGCAGCAGCCTGC 
AGGAGGTCGTCCAGGCAGCTGAGGCTGGCGCCGACCTTGTCCTGCTGGACAACTTCAAGC 
CAGAGGAGCTGCACCCCACGGCCACCGCGCTGAAGGCCCAGTTCCCGAGTGTGGCTGTGG 

AAGCCAGTGGGGGCATCACCCTGGACAACCTCCCCCAGTTCTGCGGGCCGCACATAGACG
TCATCTCCATGGGGATGCTGACCCAGGCGGTCCCAGCCCTTGATTTCTCCCTCAAGCTGT 
TTGCCAAAGAGGTGGCTCCAGTGCCCAAAATCCACTAGTCCTAAACCGGAAGAGGATGAC 
ACCGGCCATGGGTTAACGTGGCTCCTCAGGACCCTCTGGGTCACACATCTTTAGGGTCAG 
TGAACAATGGGGCACATTTGGCACTAGCTTGAGCCCAACTCTGGCTCTGCCACCTGCTGC 

TCCTGTGACCTGTCAGGGCTGACTTCACCTCTGCTCATCTCAGTTTCCTAATCTGTAAAA
TGGGTCTAATAAAGGATCAACCAAAAAAAAAAAAAAAAAAAA 

AF033199 

CGGGGCATGCTGCTTCCCTTCACCTTCCACCATGATTGTAAGTTTCCTGAGGCCTCCCCA 
GGTGTGCTTCTGTACAGCCTGTGGAATGTTACCAAAGACGTTGGAAGAGGTGGCTATGGG
ACATCACCTGGGAGAAGTGGAAGCAAATGGACACTGTTCAGAAGTCCATATACAGAAACA 
TACTTGGAAAAATATAGAAACCTGGTTTTGCTAGATGGGAAGCTTGCAGCTGGGGCCAAG 
ACATCAAGAGTAGAGCAGCAGGACATTTCAAAAGAAGATTAACTCAAAGATTAGAGATGG 
AAGAACTTGCAAAGAGAAAGTCTGTACCGGAAGAAATCTGGAAATCTAGAGGCCAGTTTA 
AGAATCAGCAGCTAAACAAGGAGAATAATCTAGGGCAAGAGATAGCTACCTGCACAAAAA
TTCCTACCAGAAAAAGAGACATAGAATCTAATGAATTTGTGAAAAATTTTACTGTAAGAT 
CAATACTTGTTGCAGAACAGATAGATCCTATGGAAGAGAATTGTCATAAATATGGTACAT 
GTTGAAAGATGCTCAAACAAAACTCAGATTTAATTATACAAAGAAAGTATGATGGAAAAA 
AAAAAACCTTGTAAATATAGTGAATGTGGGAGAACCTTCAGAGGCCACATCACTCTTGTT 
CAGCATCAAATAACTCATTGTGGAGAGAGACCCTGTAAATGTACTGAGTGTAGAAAGGGA
TTTAATCAGAGTTCCCACTTAAGAAATAATCAGAGAAAAACTCTTTCAGGAGAAAAGCCC 
TACAAATGCAGTGAGTGTGGGAAGGCCTTCAGTTATTGCTTAGTTCTTAATCAACACCAG
AGAATTCACAGTGGAGAGAAACCTTATGAGGGTACTGAATGTGGCAAGACATTCATTCAG 
TCGTACATACCTTACTCAGCATCAAAGAATTCACACACTGGTGAGAAGCCCTATACATGT 
CTTGAATGTGGAAGGCTTTTTAGTCAGAACACACATCTTACTCTACATCAGAGAATCCAT 
ACTGGAGAGAAACCTTATGAATGCAATGAATGTGGTAGGTCCTTTAGTCAGACTGCACAT
CTTACTCAACATCAAAGAATGTATACAGGAGAAAAACTCTATGAATGTAATGAATGTGAG 
AAAGCCTTCCATGATCACTCAGCTCTTATTCAACATCATATTGTCCATACTGCAGAGAAA 
CCCTATGATATCATGACTGGGAAAACTTTCAGTTACTGTTCAGACCTCATTCAACATCAG 
AGAATGCACACTGGAGAGAAACCATACAAATGCAATGAATGTGGGAATGCCTTTAGTGAT 

TGTTCATCCCTTATTCAGCATCAAAGAACTCACACTGGAGAAGAGCCTTATGAATGTAAG
CAATGTGGAAAAGCCTTTAGCAGAAGCACATACCTTACTCAACATCAGAGAAGTCACGCA 
GGAGAGAAACAGTATAAATGCAATGAATGTGAGAAAACTTTCAGCCTGAGTTCATTCCTT 
ACACAGCATATGAGGGTTCAGACTGGAGAAAAACCCTACAAATATAATGAATATGGAAAA 
GCTTTTAGTGACTGCTCAGGACATTTTCAGAGAACTCACACTGGAGAGAAGCCCTGTGAA 

TGTAATGACTGTGGGAAACCTTTCAGTTTCTGTTCAGCCCTAATTCAACATAAGAGAATT
CATACCAGAAAGAAGCCCTGACTGTACCTTCATACCAGTAAATGCACTGACTGTGGAAAA 
GCCTTCAGTGATTGGTTAGCACTTGTTCAACATCAGATAACTCAACACTGGAGAAAAACC 
GTATAAATGTACTGAATGTGGAAAAGCCTTCAGTTGGAGTACAGACCTCAAAAATCACCA 
GAAAACTCATACTAGTGAAAAATCCTATAAATGTAATGAATGTAGAAAGGCCTTTAGTTA 

CTGCTCTGGTCTTATTCAATGTCAGGTCATTCATACTATAGAAAAACCTTATGAATACGG
TAAATGTGGCAAAGCCTTTAGGCAGAGGACAGACCTTAAAAAACATCAGAAAATGCATAC 
CGAAGAGAAACCCTATGAATGTAATGAATGTGGGAAAGCCTTTAGCCAGAGCACATATCT 
TACAAAACACCAAAAAATTCATAGTGAAGAGAAATCAAATATACATACTGAGTGTGGGGA 
AACCATTAGACAAAACTCTTCTTTTTACAACAATAAAACCTCACACTGGAGAGTTCTCTG 

AATGCCTTAAGAATTTGGTTAATATGGAGACCCTTCCCAGGGAAACAGAAGGAGGATCGT
GAAAACCGTTGACTACTTGAATGATCACATGGTTTAGTGGAGAGAGCATGATTCTGGGTT 
TTAAAAGTCATGGATCTCAATCTCAGCTCCTATTACTAACTAGATCTTTTACTTTGGGGT 
AAGTCACTTCATATCTTTAGGCCTTAATTTCCTCATCTGAAAACTGGAAGGCCTGACTTG 
ACTTGTTGAGCTTAAGATCCTCAATTATTATATTTACTAGGAATTCAAGTTTCTATAGAT 

GTGGTTCAGAATTGTGACTTATTTATTGTACATCAGGTGTGATTCACAAGTGAGCTTGTA
GTAGTTATTAAGGAGTCAATAAAGATATGATATAAAAAAAAAAAAAAAAA 



AI688494 (IMAGE Clone ID: 2330499) 

CATTTCATCTTCATTGGATAGTGTTACATAGTAATATATTTATGTTTTCTTTTAATCATT 
TCATAACTTGGAAAATACTAACATAGTCAAAACTCTAGGGTAGGTGATACATGAGTTTCT
GTAGTAATCTGGTTGGAGACATGTTGTAATTCTGTATATATATGTACATTTATCCCATGC 
ATGTTATGCCTAAACTAAGACGGATACCCCTGAATTAAGAGGTGCTGTTATACATTGACC 
AGGCTTAAGAATATCTCTTTAAAGTGTGTCGACATTTAATTGACCTTTGGAAGTTCATTC 
TGTTAATCATACTCAAAGTGCTAAAGCTATGGTTGACTGCTCTGGTGTTTTTATATTCAT 
TCGTGCTTTAGCATATAAATTCTTCAGCATAATTGCTACTTATTTAGCAAGAGTTTCCTT
TATTTGAAAATGTGAGTTGTGCTTGTATTTTTGTGTCTTTCTTTCTTTCTTTCTTTTTTT 
AAACTTTGCTTCAGGCTGGGTAGTGGTAGAGGTTTGAATTAAAATGTTTTCCTGTCAGTA 
AAAAAAAAAAA 
AL157459

GAGCGAGCCCAGCAGCTTGCCCTTGACAGGTGGGGGCTGGCTGGGGCCTTAATGTGAAAA 
GACAGTGGCAGGCAGCTGGAGTAGAGCGAGCCCAGCAGCCCTAAAAGGCTGCCTTCATGG 
CCATCTAGCCCCAGTTCAGGGCAGCATCCATAGCCCACAAGCCAGCGTGGGTGGGGCGGG
GGTGGTCCCACAGCTGGGTTCCACCTGAAGAGCCTCCGTGCCTCGGAGCAGGAGAGGCAG 
GCTATGGCTGTCACCCTCCCTCCTGCCTGTGTCCCAGTGAGAACTGACCTGAGTCCCCTT 
CCAAACCCAGACCCACCTCCTGCCCCAGGCCCACTGAAGCATGTTCCATTTCTAAAAAGC 
CCAGAGTTCAGTGTGTCCCAAGGAAAACCCAAAGTGGAGGTGCTCAGGTCCAGGGGAGTC 

CAGTGGGCAGGACCCTTGGCAGGCAAGCCCCTCCCTTCACTCCCAGGACCTACCTTCTGC
TAGTAAAGGACTGGCTTCATTCTAATTATGGCCCACAGACTGCCCCGGAGACCTGGAGGA 
CAGCAGTGCTGGCACTTGGGTGTCCATGGGCCCGTCTGCCGGCTCTGCCTGTGCTGCAAG 
TGTTGGCCGTGGGTCCAGCCAACAACTCCCTACGTCCTGTGTGGGGCCCTGCCCAAGTGG 
ATGAGGCATTCCTTGAGGAGTATCATTTTCCCTGACAATCCCCATCACCTTTAGGGGTTC 

CCTGCTTGGCTCCTTTCCAGCTGAAAAACTAGACCTGTGCCATTGGGGAAGCTGGACAAA
GTCTAGGGGGCCCGCCTGGTAGAGGGTCCCGGGAAGCTGGATCTGTCAGCCTCGGCCCTG 
AGGCCCCTGTTAACTCAAGACTGTGAGCTGCCTCTAGGTGGTCACGTCTGGGAGCTAGCT 
TGTATGGCTTCTGACCAGTATCAGGATTTCTGTTCTGAGAGCAGCGTGGGCAGCAAGGCA 
GGGCAGCCCAGAGGTGGCAGCGGCAGGCAATCTGGTCACTAGGTCTTTGTGATGCCAAAA 

ATAAAAGAGGGTGGGGTGGGTGCTTTCTGTTCCTCTGATTGGATGGAGTCCGCCAGCAGG
CATGGGGCTACATTCCAGTGCCTGACTATAGGGAGGCACTCCTGATTCCATGGAGCAGCC 
CGGACTTTGAGAATGGGCTCTGGTTTGCGGGGGGCAGGCGTACCAGACTGCAAGACCCCC 
CAGTACCTCACCGTGCCAAATAGGAAGAGGTGGCCTTGGTGTAGCCAAATGGATCTTTTT 
AACAGTGTGCCTTTGGGGAGGGACCCATGTCCATGGCTTCGTTGAGGGCCATCCATATGC 

CAGCTGGGGGCCAGCCCACAGTGGCCATATTGGCTGCAGCAGGAATGGTGCCCACCTCGG
CGAATTGAAGGGCTAAGAGTCCCAGATAGCTAGGCCAGAGCTGGAAGCAGACAGTAAGGG 
GAAGAGCTGCTCCCACAGGAGAGGGAGAGATTCCAGCTCACTGCGCAGCCTGGGAGGAGG 
CGTGGATCCTGGCACGCTGAGCCTCAGGCACCAGCCTCCCTGTGCTCGACAGCAAAGTCT 
TGACTCCTTCCTGCTGAGCACTGTGCTACCTTCACTGCTCCAAAGCCAGACTAACAGCTC 

TCCAAGCCCTTGGGGTGACTCGGCTTCCAGGAGCTGTTGGAGAAATGAGGATGTCTGTCC
CTGTCTGCCTGGGCAGGCCAGATTCCTCCCCAGCAGCCGGGTCTCTCCAGACCCTGATTC 
GGTGCCTTTCTGTTTACCAGCTACTTCAATCCCAAAGTTTGAATCTGCAGATACCTTACT 
CCCAGCCACTTTGCCTTCTTACTGTGTTGTGTGTTTTTCCTGGTGCTTCAAGAGCGTGTG 
CAGGGCAAGTGCCGTCACTGGGAACTGCACCAGATGCTCAGACTTGGTTGTCTTATGTTT 

ACCAATAAATAAAAGTAGACTTTTTCTATTTTTATTTGCTGCTATTTGTGTGTGTGTTTG
TGTTTGTGTAGCTAGGTATCTGGCACTTCTGACGATGCATTGTTGCTTTTTTCCCGAAGG 
TCCCGCAGGAACTGTGGCAATGGTGTGTGTGTGAAATGGTGTGTTAACCGCGTTTTGTTT 
GCTCCTGTATTGAATAGGAAGCAGTGGCCAGTCTGTCTTCCTTAGAGATGTTAGCATATT 
TTTATATGTATATATTTTGTACCAAAAAAGAGTGTTCCTTGTTTTGGTTACACTCGAAAT 

TCTGACCTAGCTGGAGAGGGCTCTGGGCCGAGAGCTTTCACTAAGGGGAGACTTCAGGGG
AGGATCAAGCTTTGAACCAAAGCCAATCACTGGCTTGATTTGTGTTTTTTAATTAAAAAA 
AAAATCATTCATGTATGCCACTTCTAAAAAAAAAAAAAAAAAAAAAAAAA 
GGCACGAGGCTGAGACCGGTGCGCCGCGCGCTAGTGGCCGCTCTTCCGCGGGCTAGCGGG
CGGTGGGGGCGCCAGCAGCGCGGAAGGCGGGCACGCGGGCCATGGCTCCCTGGGCGGAGG 
CCGAGCACTCGGCGCTGAACCCGCTGCGCGCGGTGTGGCTCACGCTGACCGCCGCCTTCC 
TGCTGACCCTACTGCTGCAGCTCCTGCCGCCCGGCCTGCTCCCGGGCTGCGCGATCTTCC
AGGACCTGATCCGCTATGGGAAAACCAAGTGTGGGGAGCCGTCGCGCCCCGCCGCCTGCC 
GAGCCTTTGATGTCCCCAAGAGATATTTTTCCCACTTTTATATCATCTCAGTGCTGTGGA 
ATGGCTTCCTGCTTTGGTGCCTTACTCAATCTCTGTTCCTGGGAGCACCTTTTCCAAGCT 
GGCTTCATGGTTTGCTCAGAATTCTCGGGGCGGCACAGTTCCAGGGAGGGGAGCTGGCAC 

TGTCTGCATTCTTAGTGCTAGTATTTCTGTGGCTGCACAGCTTACGAAGACTCTTCGAGT
GCCTCTACGTCAGTGTCTTCTCCAATGTCATGATTCACGTCGTGCAGTACTGTTTTGGAC 
TTGTCTATTATGTCCTTGTTGGCCTAACTGTGCTGAGCCAAGTGCCAATGGATGGCAGGA 
ATGCCTACATAACAGGGAAAAATCTATTGATGCAAGCACGGTGGTTCCATATTCTTGGGA 
TGATGATGTTCATCTGGTCATCTGCCCATCAGTATAAGTGCCATGTTATTCTCGGCAATC 

TCAGGAAAAATAAAGCAGGAGTGGTCATTCACTGTAACCACAGGATCCCATTTGGAGACT
GGTTTGAATATGTTTCTTCCCCTAACTACTTAGCAGAGCTGATGATCTACGTTTCCATGG 
CCGTCACCTTTGGGTTCCACAACTTAACTTGGTGGCTAGTGGTGACAAATGTCTTCTTTA 
ATCAGGCCCTGTCTGCCTTTCTCAGCCACCAATTCTACAAAAGCAAATTTGTCTCTTACC 
CGAAGCATAGGAAAGCTTTCCTACCATTTTTGTTTTAAGTTAACCTCAGTCATGAAGAAT 

GCAAACCAGGTGATGGTTTCAATGCCTAAGGACAGTGAAGTCTGGAGCCCAAAGTACAGT
TTCAGCAAAGCTGTTTGAAACTCTCCATTCCATTTCTATACCCCACAAGTTTTCACTGAA 
TGAGCATGGCAGTGCCACTCAAGAAAATGAATCTCCAAAGTATCTTCAAAGAATAAATAC 
TAATGGCAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 